Background

Silverbrook's bilithic Memjet™ printheads are the target printheads for printing systems
which will be controlled by SoPEC and MoPEC devices.

This document presents the format and structure of these printheads, and describes
their possible arrangements in the target systems. It also defines a set of terms used to
differentiate between the types of printheads and the systems which use them.

Currently, this document is only concerned with the structure of the printheads and their
systems, with regard to the way in which dot data is loaded.

1.1 Companion Documents

Refer to the Bilithic Printhead Specification [1] for the complete description of the
functionality of these devices.

This document relies on certain definitions and details presented in the Bilithic Printhead
Specification [1].

1.2 Readership

It is intended that this document be used as a reference for engineers involved in the
design work on the SoPEC and MoPEC projects.
2 Definitions

This document presents terminology and definitions used to describe the bilithic printhead
systems. These terms and definitions are as follows:

• Printhead Type - There are 3 parameters which define the type of printhead used in a
system:

• Direction of the data flow through the printhead (clockwise or anti-clockwise, with
the printhead shooting ink down onto the page).

• Location of the left-most dot (upper row or lower row, with respect to V+).

• Printhead footprint (type A or type B, characterized by the data pin being on the left
or the right of V+, where V+ is at the top of the printhead).

• Printhead Arrangement - Even though there are 8 printhead types, each arrangement
has to use a specific pairing of printheads, as discussed in Section 3. This gives 4
pairs of printheads. However, because the paper can flow in either direction with
respect to the printheads, there are a total of eight possible arrangements, e.g.
Arrangement 1 has a Type 0 printhead on the left with respect to the paper flow, and
a Type 1 printhead on the right. Arrangement 2 uses the same printhead pair as
Arrangement 1, but the paper flows in the opposite direction.

• Color 0 is always the first color plane encountered by the paper.

• Dot 0 is defined as the nozzle which can print a dot in the left-most side of the page.

• The Even Plane of a color corresponds to the row of nozzles that prints dot 0.

Note that throughout this document, where the various printheads and systems are
presented, the printheads always shoot ink down onto the page.

Figure 1 shows the 8 different possible printhead types. Type 0 is identical to the Right
Printhead presented in Figure 3 in [1], and Type 1 is the same as the Left Printhead as
defined in [1].
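
The counting in the definitions above can be checked with a short sketch. It only enumerates the three two-valued parameters and multiplies pairs by paper directions; the order of the generated combinations is an assumption of this sketch and does not correspond to the Type 0 to Type 7 numbering, which is fixed by Figure 1.

```python
from itertools import product

# The three two-valued parameters named in the definitions above.
DATA_FLOW = ("clockwise", "anti-clockwise")
LEFTMOST_DOT_ROW = ("upper", "lower")
FOOTPRINT = ("A", "B")


def printhead_parameter_combinations():
    """Return every combination of the three printhead parameters.

    The list order is arbitrary: it is NOT the Type 0..7 numbering.
    """
    return list(product(DATA_FLOW, LEFTMOST_DOT_ROW, FOOTPRINT))


def arrangement_count(num_pairs=4, paper_directions=2):
    """Each printhead pair can be used with the paper flowing either way."""
    return num_pairs * paper_directions


print(len(printhead_parameter_combinations()))  # 8
print(arrangement_count())                      # 8
```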
While the printheads shown in Figure 1 look to be of equal width (having the same number
of nozzles) it is important to remember that in a typical system, a pair of unequal sized
printheads may be used.

Figure 1. Printhead Types 0 to 7

Table 1 defines the printhead pairing and location of each printhead type, with respect
to the flow of paper, for the 8 possible arrangements.

Table 1. Definition of the different printhead arrangements
3 Bilithic Printhead Systems

When using the bilithic printheads, the position of the power/gnd bars coupled with the
physical footprint of the printheads means that we must use a specific pairing of printheads
together for printing on the same side of an A4 (or wider) page, e.g. we must always use a
Type 0 printhead with a Type 1 printhead, etc.

While a given printing system can use any one of the eight possible arrangements of
printheads, this document only presents two of them, Arrangement 1 and Arrangement 2, for
purposes of illustration. These two arrangements are discussed in subsequent sections of
this document. However, the other 6 possibilities also need to be considered.

The main difference between the two printhead arrangements discussed in this document
is the direction of the paper flow. Because of this, the dot data has to be loaded differently
in Arrangement 1 compared to Arrangement 2, in order to render the page correctly.
3.1 Example 1: Printhead Arrangement 1

Figure 2 shows an Arrangement 1 printing setup, where the bilithic printheads are
arranged as follows:

• The Type 0 printhead is on the left with respect to the direction of the paper flow.

• The Type 1 printhead is on the right.

[Figure 2 labels: Type 0 Printhead; Type 1 Printhead; Gnd; Direction of Paper Flow.
The printheads are facing downwards. The ink is being shot down onto the page.]

Figure 2. Identification of printhead nozzles and shift-register sequences for
printheads in Arrangement 1
Table 2 lists the order in which the dot data needs to be loaded into the above printhead
system, to ensure color 0-dot 0 appears on the left side of the printed page.

Table 2. Order in which the even and odd dots are loaded for printhead Arrangement 1

Dots    Type 0 printhead                      Type 1 printhead
Odd     Loaded second in descending order.    Loaded first in descending order.
Even    Loaded first in ascending order.      Loaded second in ascending order.

Figure 3 shows how the dot data is demultiplexed within the printheads. 

[Figure 3 labels: Type 0 Printhead; Type 1 Printhead; Data[0] and Data[1] on each printhead.]

Figure 3. Demultiplexing of data within the printheads in Arrangement 1

Figure 4 and Figure 5 show the way in which the dot data needs to be loaded into the
printheads in Arrangement 1, to ensure that color 0-dot 0 appears on the left side of the
printed page.

[Figure 4 signals: Data[0]; SrClk.]

Figure 4. Signalling for a Type 0 printhead in Arrangement 1

[Figure 5 signals: Data[0].]

Figure 5. Signalling for a Type 1 printhead in Arrangement 1
3.2 Example 2: Printhead Arrangement 2

Figure 6 shows an Arrangement 2 printing setup, where the bilithic printheads are
arranged as follows:

• The Type 1 printhead is on the left with respect to the direction of the paper flow.

• The Type 0 printhead is on the right.

[Figure 6 labels: Type 0 Printhead; Type 1 Printhead; V+; Gnd; Direction of Paper Flow.
The printheads are facing downwards. The ink is being shot down onto the page.]

Figure 6. Identification of printhead nozzles and shift-register sequences for
printheads in Arrangement 2
Table 3 lists the order in which the dot data needs to be loaded into the above printhead
system, to ensure color 0-dot 0 appears on the left side of the printed page.

Table 3. Order in which the even and odd dots are loaded for printhead Arrangement 2

Dots    Type 0 printhead                      Type 1 printhead
Odd     Loaded first in descending order.     Loaded second in descending order.
Even    Loaded second in ascending order.     Loaded first in ascending order.

Figure 7 shows how the dot data is demultiplexed within the printheads. 
Figure 7. Demultiplexing of data within the printheads in Arrangement 2

Figure 8 and Figure 9 show the way in which the dot data needs to be loaded into the
printheads in Arrangement 2, to ensure that color 0-dot 0 appears on the left side of the
printed page.

Figure 8. Signalling for a Type 0 printhead in Arrangement 2

[Figure 9 signals: SrClk.]

Figure 9. Signalling for a Type 1 printhead in Arrangement 2
3.3 Conclusions

Comparing the signalling diagrams for Arrangement 1 with those shown for Arrangement
2, it can be seen that the color/dot sequence output for a printhead type in Arrangement 1
is the reverse of the sequence for the same printhead in Arrangement 2, in terms of the order
in which the color plane data is output, as well as whether even or odd data is output first.
However, the order within a color plane remains the same, i.e. odd descending, even
ascending.

From Figure 10 and Table 4, it can be seen that the plane which has to be loaded first (i.e.
even or odd) depends on the arrangement. Also, the order in which the dots have to be
loaded (e.g. even ascending or descending, etc.) is dependent on the arrangement.

If the device controlling the printheads can re-order the bits according to the following
criteria, then it should be able to operate in all the possible printhead arrangements:

• Be able to output the even or odd plane first.

• Be able to output even and odd planes in either ascending or descending order,
independently.

• Be able to reverse the sequence in which the color planes of a single dot are output to
the printhead.
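
The three re-ordering criteria above can be sketched as follows. This is a minimal illustration, not the controller's actual logic: it assumes dots are numbered from 0 within one printhead, and the function and parameter names are hypothetical. It does not model how colors and dots are interleaved on the data lines.

```python
def dot_load_order(num_dots, first_plane="even",
                   even_ascending=True, odd_ascending=False):
    """Return dot indices in the order they would be loaded.

    first_plane    -- "even" or "odd": which plane is output first
    even_ascending -- direction of the even plane
    odd_ascending  -- direction of the odd plane (independent of even)
    """
    evens = list(range(0, num_dots, 2))
    odds = list(range(1, num_dots, 2))
    if not even_ascending:
        evens.reverse()
    if not odd_ascending:
        odds.reverse()
    if first_plane == "even":
        return evens + odds
    if first_plane == "odd":
        return odds + evens
    raise ValueError("first_plane must be 'even' or 'odd'")


def colour_sequence(num_colours, reverse=False):
    """Return the order in which the color planes of one dot are output."""
    planes = list(range(num_colours))
    return planes[::-1] if reverse else planes


# Even ascending loaded first, odd descending loaded second:
print(dot_load_order(8))                       # [0, 2, 4, 6, 7, 5, 3, 1]
# Odd descending loaded first, even ascending loaded second:
print(dot_load_order(8, first_plane="odd"))    # [7, 5, 3, 1, 0, 2, 4, 6]
```

Keeping the three choices as independent parameters mirrors the way the criteria are stated: each can be set without affecting the other two.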



[Figure 10 labels: Arrangement 1 to Arrangement 8, each shown with its direction of paper flow.]

Figure 10. All 8 Printhead Arrangements
Table 4. Order in which even and odd dots and planes are loaded into the various
printhead arrangements

Printhead Arrangement   [heading illegible]             [heading illegible]

Arrangement 1           Even ascending loaded first     Odd descending loaded first
                        Odd descending loaded second    Even ascending loaded second

Arrangement 2           Odd descending loaded first     Even ascending loaded first
                        Even ascending loaded second    Odd descending loaded second

Arrangement 3           Odd ascending loaded first      Even descending loaded first
                        Even descending loaded second   Odd ascending loaded second

Arrangement 4           Even descending loaded first    Odd ascending loaded first
                        Odd ascending loaded second     Even descending loaded second

Arrangement 5           Odd ascending loaded first      Even descending loaded first
                        Even descending loaded second   Odd ascending loaded second

Arrangement 6           Even descending loaded first    Odd ascending loaded first
                        Odd ascending loaded second     Even descending loaded second

Arrangement 7           Even ascending loaded first     Odd descending loaded first
                        Odd descending loaded second    Even ascending loaded second

Arrangement 8           Odd descending loaded first     Even ascending loaded first
                        Even ascending loaded second    Odd descending loaded second
Bi-lithic Printhead Specification

1.0 Basic Requirements

[...] by "Stitching" reticle images, [...] 1600 dpi.

The first nozzle of the right chip should have a 32 um horizontal offset from the final
nozzle of the left chip for the same color row. There [...] ink nozzle overlap (of the
same colour) scheme employed.

1.1 Power Supply 

lengt of the chips, b« » »ffl b« «v,8.ted). 

1.2 MEMSecBs 

during this pulse. 
1.2.1 ISSUE!!!

per li^ WiO. 1 »sec fe putae eyde, '•'^J^^^l ^ 138 n<«*s a a 
(looking ahead) We have 13824 nozzles across the page, we
time. That is about 8 Amperes if all nozzles fire.

That 8 Amperes is for only 1 colour! 16 A x 6 colours = 96 A for all colours.
Hov^many colours couldprin.a..hesametime.CJ^ooloj^^^^^ 

„1 a. the time are required, to create m^ f??i^^of taftaRed ir*. 

gr^d).Bu..heflxa«Veirft^-^^^ 



1.2.2 64 um unit cell height

This cell would have 4 line spacing between the odd and even dots, and 8 line spacing
between adjacent colours.

1.2.3 80 um unit cell height

This cell would have 5 line spacing between the odd and even dots, and 10 line spacing
between adjacent colours.
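
The line spacings quoted for the two unit cell heights follow from the 1600 dpi line pitch. The sketch below reproduces the arithmetic; it assumes that the odd/even spacing equals one unit cell height and that adjacent colours are two unit cells apart, which is an inference from the quoted numbers rather than something stated explicitly.

```python
DPI = 1600
MICRONS_PER_INCH = 25400.0


def line_pitch_um(dpi=DPI):
    """Distance between adjacent print lines in microns (15.875 at 1600 dpi)."""
    return MICRONS_PER_INCH / dpi


def line_spacings(unit_cell_height_um, dpi=DPI):
    """Return (odd/even spacing, adjacent-colour spacing) in whole lines.

    Assumption of this sketch: one unit cell separates the odd and even
    rows of a colour, and two unit cells separate adjacent colours.
    """
    odd_even = round(unit_cell_height_um / line_pitch_um(dpi))
    return odd_even, 2 * odd_even


print(line_spacings(64))  # (4, 8)
print(line_spacings(80))  # (5, 10)
```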

1.3 Versions

1.3.1 6 Colour 1600 dpi with 64 um unit cell

Left and Right Chip. This version will not be prototyped.

1.3.2 6 Colour 1600 dpi with 80 um unit cell

Left and Right Chip.

1.3.3 4 Colour 800 dpi with 80 um unit cell

For camera application. Single nozzle row per colour.
This version will not be prototyped.

1.4 Air Supply

Air must be supplied to the MEMS region through holes in the chip.

2.0 Head Sizes 

TABLE 1. Head Combinations
Left Head

Nozzles per Colour
11160
9744
row



1 R+104^*12. Nozzles per 
ISl^oi manages «. avoid to se., wiftou. any loses. 



3.0 Interface 



TABLE 2. I/O pins 



Name
Data[0-1]

DataL[0-1]



ReadL 



I/O 



Function

(DataL the complementary signal). colours[0-2] on
Data[0], colours[3-5] on Data[1]



Feedback for CMOS testing (LSyncL^l. ReadL^) 
and {LSyncL^, ReadL^) 

0] - nozzle test result 

;i] - temperatuic 



Common 
?4o 



complementary signal of Data[0-1]

Feedback for CMOS testing [...]
[0] - nozzle test result

[1] - temperature

Data shift clock using Differential Signalling
(SrClkL the complementary signal)
complementary signal of SrClk



No** 



600** 



Data[0-1]/DataL[0-1] in output mode (driving non-differential)



Yes 



Fire pattern shift clock
Pulse Profile for all colours
0 - Capture dot data for next print line
a. Functionally could be [...], but for timing/electrical [...]

b. 300 MHz clock, so edges are at a 600 MHz rate [...]
c. 1 MHz cycle, but the resolution of the mark/space ratio may require 50 ns.

d. 10 kHz cycle, with minimum low pulse of 10 ns (no maximum).

controller (SoPEC).



I 



3 



3.1 Dot firing

To fire a nozzle, three signals are needed: a dot data, a fire signal, and a profile.
When all signals are high, the nozzle will fire.

FIGURE 1. Print head structure
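
The firing condition described above is a three-input AND. The following sketch states it explicitly; the function name is hypothetical and the signals are modelled as plain 0/1 values rather than as hardware nets.

```python
def nozzle_fires(dot, fire, profile):
    """A nozzle fires only while dot data, fire signal and profile are all high."""
    return bool(dot) and bool(fire) and bool(profile)


# Truth table: only the (1, 1, 1) row fires.
for dot in (0, 1):
    for fire in (0, 1):
        for profile in (0, 1):
            print(dot, fire, profile, nozzle_fires(dot, fire, profile))
```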



o. 




and docked toto ttw dup "f '^^et the dots are shifted into the 

^SS «1 toe the dot p«.em m the ^ot latch .s been fixed. 
Ao^thetopof-coltu^ofno^es — »2no^«.-^^ 

^S^d. one n>gis«« bit in e«h dheeti«. flow. 



FIGURE 2. Column Structure

[Figure 2 labels: Column M; SrClk; Dot[3]; Dot[1]; Dot[0].]



. A- eaioM Shift Reqister that runs the length of tiie 

The select register forms the Select f "^yJ/^^S'*^ reaisters is used to enables 

selects the reverse direction fire register. 



FIGURE 3. Print head dot shift register dot mapping to page

[Figure 3 labels: Paper Movement; ink shooting out of page; reader looking through paper
over print heads; Section A-A through even nozzles; Left Print Head - (n-m) nozzles.]

With this mapping, the following data streams will need to be provided.

^ . . Head Combina -- ^'-^ ^^"'^^ ^IThtHead 

Left Head 
I dot order 

Isizej n-m 



rr^T border A^nln 3 5 ..4075,4077,40 /^.l f ' 

' ' <AQft line y 



(CO. CI . C2)....) m the aDOve o . 
pulses (and 3L+1 rising edges). 



FIGURE 4. Data Timing During Printing

3.3 Fire Shift Register

1 r ifthereeisterisftillof'l'sthen 

that(4800A)! 

CI sWft register ^ -•''f^l^^^ZT^^^^^-'^^' 



FIGURE 5. Print quality




nn77le at the same time starting 
This is done by n™g 2 - ^^"^^t^S ""^ 
To achieve this fire pattern the fire shift register and select shift register need to
be set up as shown in Figure 6.



FIGURE 6. Fire and Select Shift Register setup for printing

[Figure 6 shows the select shift register pattern
...0000000....0001111111....1110000000....0001111111....111 aligned against the fire
shift register pattern.]

The pattern has shifted a '1' into the fire shift register every nth position (where n
is usually a minimum of about 100) and n '1's, followed by n '0's, in the select shift
register. At the start of a print cycle, these patterns need to be aligned as above, with
the "1000..." of a forward half of the fire shift register matching an n grouping of '1's
or '0's in the select shift register. Likewise, the "1000..." of a reverse half of the
fire shift register must match an n grouping of '1's or '0's in the select shift
register. To continue this print pattern across the butt ends of the chips, the select
shift register in each should end with a complete block of n '1's (or '0's).
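
The alignment requirement can be illustrated with a small model. This is a sketch under simplifying assumptions: both registers are modelled as flat bit lists of equal length, only one direction (a forward half) is considered, and all function names are hypothetical.

```python
def fire_pattern(length, n):
    """A '1' every n positions, '0' elsewhere: 1000...1000..."""
    return [1 if i % n == 0 else 0 for i in range(length)]


def select_pattern(length, n, first_bit=1):
    """Alternating runs of n identical bits: n '1's, then n '0's, and so on."""
    return [first_bit ^ ((i // n) % 2) for i in range(length)]


def aligned(fire, select, n):
    """True if every '1' in fire starts a group of identical select bits."""
    for i, bit in enumerate(fire):
        if bit and len(set(select[i:i + n])) > 1:
            return False
    return True


n = 4  # the text suggests about 100 in practice; 4 keeps the output readable
fire = fire_pattern(16, n)
select = select_pattern(16, n)
print(fire)    # [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
print(select)  # [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
print(aligned(fire, select, n))                # True
print(aligned([0] + fire[:-1], select, n))     # False: fire shifted by one
```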

FIGURE 7. Fire Pattern across butt end of Print Chips

...1110000000....0001111111....111        Left Print Head Fire/Select SR

1111111....1110000000....0001111111       Right Print Head Fire/Select SR

Since the two chips can be of different lengths, initialisation of these patterns is
difficult. This is solved by building initialisation circuitry into the chips. This
circuit is controlled by the registers nlen(14), count(14) and b(1). These registers are
loaded serially through Data[0], while LSyncL is low and ReadL is high, with FrClk.

FIGURE 8. Fire Pattern Generation

The scan order from input is b, n[13-0], c[0-13], therefore b is shifted in last.
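
The serial load order implied by the scan chain can be sketched as below. The sketch assumes that bit index equals binary weight (nlen[13] is the most significant bit of nlen) and that the first bit clocked in ends up furthest from the input; the function names are hypothetical. With the chain ordered b, nlen[13..0], count[0..13] from the input, count[13] must be sent first and b last.

```python
NLEN_BITS = 14
COUNT_BITS = 14


def fire_load_bitstream(nlen, count, b):
    """Bits in the order they are clocked into Data[0] (first element first)."""
    if not 0 <= nlen < (1 << NLEN_BITS):
        raise ValueError("nlen must fit in 14 bits")
    if not 0 <= count < (1 << COUNT_BITS):
        raise ValueError("count must fit in 14 bits")
    if b not in (0, 1):
        raise ValueError("b must be 0 or 1")
    bits = [(count >> i) & 1 for i in range(COUNT_BITS - 1, -1, -1)]  # count[13]..count[0]
    bits += [(nlen >> i) & 1 for i in range(NLEN_BITS)]               # nlen[0]..nlen[13]
    bits.append(b)                                                    # b goes in last
    return bits


def shift_in(bits):
    """Model the scan chain: index 0 is the stage nearest the input."""
    chain = [0] * len(bits)
    for bit in bits:
        chain = [bit] + chain[:-1]
    return chain


chain = shift_in(fire_load_bitstream(nlen=99, count=71, b=0))
print(len(chain))  # 29
```

After shifting, `chain[0]` holds b, `chain[1:15]` holds nlen[13] down to nlen[0], and `chain[15:29]` holds count[0] up to count[13], matching the stated scan order.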
The following table shows the values to programme the bi-lithic head pairs using a fire

TABLE 4. Head Combinations Initialisation for n=100

Nozzles  Nozzles  nlen(A&B) =  countA =           bA  bB  rem =         countB =
La       Lb       n-1          (La/2) mod n - 1           (Lb/2) mod n  (La-Lb+rem) mod n - 1

9744     4080     99           71                 0   0   40            3
8328     5496     99           63                 0   0   48            79
6912     6912     99           55                 0   0   56            55
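
The table entries can be reproduced from the column formulas. The sketch below is a check of that arithmetic only; the function name and dictionary keys are hypothetical, and the case where a modulo result is 0 (giving a count of -1) is left unhandled because the source does not say how it is treated.

```python
def fire_init_values(la, lb, n=100):
    """Register values for a head pair with la and lb nozzles per colour.

    Mirrors the table columns:
      nlen   = n - 1
      countA = (La/2) mod n - 1
      rem    = (Lb/2) mod n
      countB = (La - Lb + rem) mod n - 1
    """
    return {
        "nlen": n - 1,
        "count_a": (la // 2) % n - 1,
        "rem": (lb // 2) % n,
        "count_b": (la - lb + (lb // 2) % n) % n - 1,
    }


for la, lb in ((9744, 4080), (8328, 5496), (6912, 6912)):
    print(la, lb, fire_init_values(la, lb))
```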



and once the registers are initialised with La FrClk cycles (ReadL='0', LSyncL='1').
rem would be the correct value for countB if chip B was only clocked (FrClk) Lb
times. But this chip will be over clocked by La-Lb cycles. The values of bA and bB are
either the same or the inverse of each other. The actual value does not matter. They need
to be different from each other if the select shift registers would otherwise end up with
different values at the butt ends. If (La/2n) is even (and countA is non zero), then the
final run in A's select shift register will be !bA. If (La-Lb/2) mod n is even (and countB
is non zero) then the final run in B's select shift register will be !bB.



FIGURE 9. Determining Select Shift Register value

[Figure 9 labels: Head A, La/2 select shift register length, countA+1; Head B, Lb,
select shift register length, countB+1, bB.]



3.4 Profile Pattern

A profile pattern is repeated at FrClk rate. It is expected to be a single pulse about 1 us
long, but it could be a more complicated series of pulses. The actual pattern depends on
the ink type.

The following figure shows the external timing to print a line of data. In this example
the line is printed in 8 cycles of FrClk.
FIGURE 10. Timing for printing Signals

[Figure 10 signals: LSyncL; ReadL; Data; SrClk; FrClk; Pr.]

3.5 Interface Modes

The print heads have eight different modes controlled by signals ReadL and LSyncL. As
seen in Figure 10, with both LSyncL and ReadL high, the chip is in normal printing mode.
Some of these modes can operate at the same time, but may interfere with the result of
the other modes.



TABLE 5. Print Head Modes

ReadL  LSyncL  Mode

1      1       Normal Print Mode
               Internal mapping: SrClk=SrClk/3, frclk=FrClk, SelClk=0, FsClk=FrClk,
               Scan=0, CoreScan=0

X      0       Dot Load Mode
               • Dot latches are open, loaded with Dot shift registers, latch once
                 LSyncL returns to 1 (this happens regardless of ReadL)
               • Enables Dot Shift register to capture fire result.

1      0       Fire Load Mode
               • Data[0] will shift through nlen, count and b with FrClk
               Internal mapping: SrClk=X, frclk=X, SelClk=X, FsClk=FrClk, Scan=1,
               CoreScan=X

0      1       Reset Nozzle Test
               • Resets the state of nozzle test circuit
               Internal mapping: SrClk=SrClk, frclk=FrClk, SelClk=FrClk, FsClk=FrClk,
               Scan=0, CoreScan=1

0      1       CMOS testing mode
               • The contents of the dot shift registers are serially shifted out on
                 Data[0-1] with SrClk

0      1       Fire Initialise mode
               • The contents of the fire shift register and select shift register are
                 generated with FrClk

0      0       Temperature Output
               • The series of Delta Sigma outputs is clocked out on Data[0] with FrClk.
                 The sum of these bits represents the temperature of the chip.
               Internal mapping: SrClk=X, frclk=0, SelClk=0, FsClk=0, Scan=0, CoreScan=X

0      0       Nozzle Test Output
               • The result of a nozzle test is output on Data[1].
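
Because several modes share the same ReadL/LSyncL combination, a lookup on the two pins returns a set of modes rather than a single one. The sketch below encodes the table rows as listed; `None` stands for the don't-care entry (X), and the function name is hypothetical.

```python
# (ReadL, LSyncL, mode name); None means "don't care" (X in the table).
MODES = [
    (1, 1, "Normal Print Mode"),
    (None, 0, "Dot Load Mode"),
    (1, 0, "Fire Load Mode"),
    (0, 1, "Reset Nozzle Test"),
    (0, 1, "CMOS testing mode"),
    (0, 1, "Fire Initialise mode"),
    (0, 0, "Temperature Output"),
    (0, 0, "Nozzle Test Output"),
]


def active_modes(read_l, lsync_l):
    """Return every table row selected by the given pin levels."""
    return [
        name
        for r, l, name in MODES
        if (r is None or r == read_l) and l == lsync_l
    ]


print(active_modes(1, 1))  # ['Normal Print Mode']
print(active_modes(0, 0))  # ['Dot Load Mode', 'Temperature Output', 'Nozzle Test Output']
```

Returning a list makes the overlap visible, which is the situation the text warns about when it says modes operating at the same time may interfere with each other.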



3.5.1 Printing

Figure 10 shows the timing for normal printing. During this action, we drop out of
Normal Print Mode to Dot Load Mode between line transfers. For printing to perform
correctly, no other signal should be stable.

3.5.2 Initialising for Printing

To initialise for printing, the fire shift registers and select shift registers need to be
set up into a state as shown in Figure 7. To do this the chips are put into Fire Load Mode
and the values for nlen, count and b are serially shifted from Data[0], clocked by FrClk.
As the two chips have separate Data lines and a common FrClk, this happens at the same
time. Once this is done, the mode is changed to Fire Initialise Mode, and further FrClk
cycles are provided to both chips. During all these operations Pr should be low, to
prevent unintentional firing of nozzles.
FIGURE 11. Initialising Print Heads

[Figure 11 signals: LSyncL; ReadL; DataA[0] carrying (bA, nlen[13-0], count[0-13]A);
DataB[0] carrying (bB, nlen[13-0], count[0-13]B); SrClk; FrClk; Pr.
Phases: Fire Load Mode, followed by La cycles in Fire Initialise Mode.]

3.5.3 Nozzle Testing 

Nozzle testing is done by firing a single nozzle at a time and monitoring the Data[1] pin
in the Nozzle Test Output mode.

Each nozzle has a test switch which closes when the nozzle is fired. All 12 switches in a
nozzle column are connected in parallel to the following circuit.

FIGURE 12. Nozzle Test Latching Circuit

[Figure 12 label: Testout.]

This circuit is initialised whenever LSyncL is high and ReadL is low (Reset Nozzle
Test mode). This forces all "switch nodes" low, and the feedback through the lower NOR
gate will latch this value. With LSyncL low and ReadL still low (Nozzle Test Output
mode) the Testout of the first nozzle column is output on Data[1]. If any switch is
closed, the switch node of this column will be pulled up, and will ripple through to the
output as a transition from high to low.
FIGURE 13. Nozzle Testing

[Figure 13 signals: LSyncL; ReadL; SrClk; FrClk; Pr.
Phases: Set up Test; Reset Nozzle Test Mode; Nozzle Test Output Mode; Set up Test.]

Nozzle testing requires a setup phase in order to fire only one nozzle. There are many
ways to achieve this. The simplest might be to load a single colour with 101010 through the
even nozzles, and 010101 for the odd nozzles (0's for all other colours), and set up a
fire pattern with n = La/2. With this fire pattern only one nozzle will fire in each Pr
pulse. After firing in Nozzle Test Output mode, a single FrClk will advance to the next
nozzle, then Reset and Test. After La/2 cycles of this testing, a single SrClk will
advance the dot shift registers to set up the untested nozzles of this colour, and another
La/2 cycles of FrClk, Reset and Test will finish testing this colour. Then repeat the test
procedure for the other colours.



3.5.4 Temperature Output

This mode is not well defined yet. In this mode, Data[0] will output a series of ones
and zeros clocked by FrClk. After a (currently unknown) number of FrClk cycles the
sum of this series will represent the temperature of the chip. The clocking frequency in
this mode is expected to be in the range 10 kHz - 1 MHz.
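
Reading the temperature amounts to summing the bit stream over a fixed number of FrClk cycles. The sketch below shows only that accumulation; the number of cycles and any conversion from the sum to a temperature are not defined in the source, so neither is modelled, and the function names are hypothetical.

```python
def delta_sigma_sum(bits):
    """Count the ones in a captured Data[0] bit stream."""
    return sum(1 for bit in bits if bit)


def ones_density(bits):
    """Fraction of ones in the stream, independent of how many cycles were captured."""
    bits = list(bits)
    if not bits:
        raise ValueError("at least one sample is required")
    return delta_sigma_sum(bits) / len(bits)


captured = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up sample data for illustration
print(delta_sigma_sum(captured))  # 4
print(ones_density(captured))     # 0.5
```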
FIGURE 14. Temperature Reading

[Figure 14 signals: LSyncL; ReadL; Data[0]; SrClk; FrClk; Pr.]

The frequency of FrClk and the number of cycles need to be programmable. Since this
mode cycles FrClk, the contents of the fire shift register and select shift register would
be changed, but in this mode FrClk is disabled to these circuits. So printing can resume
without reinitialising.



3.5.5 CMOS Testing

CMOS testing is a mode meant for chip testing before MEMS is added to the
chip. This mode allows the dot shift register to be shifted out on the Data[0-1] pins.
Much like the nozzle test mode, the nozzles are fired while LSyncL is low, but during
the firing SrClk will be cycled, and the dot shift register will load the signal that
would fire the nozzle. Once captured, the result can be shifted out.

FIGURE 15. CMOS Testing

[Figure 15 signals: LSyncL; ReadL; Data; SrClk; FrClk; Pr.
Phases: Set up Test; Dot Load Mode; CMOS Test Output Mode.]

The Dot Load Mode above violates normal printing procedure by firing the nozzles
(Pr) and modifying the dot shift register (SrClk).
4.0 Reticle Layout

To make long chips we need to stitch the CMOS (and MEMS) together by overlapping
the reticle stepping field. The reticle will contain two areas:

FIGURE 16. Reticle Layout

[Figure 16 dimensions: 20 mm max.; 3 mm.]

The top edge of Area 2, PAD END contains the pads that stitch on the bottom edge of Area 1,
CORE. Area 1 contains the core array of nozzle logic. The top edge of Area 1 will stitch
to the bottom edge of itself. Finally the bottom edge of Area 2, BUTT END will stitch to
the top edge of Area 1. The BUTT END is used to complete a feedback wiring and seal
the chip.

The above region will then be exposed across a wafer bottom to top: Area 2, Area 1,
Area 1 ... Area 2. Only the PAD END of Area 2 needs to fit on the wafer. The final
exposure of Area 2 only requires the BUTT END on the wafer.
FIGURE 17. Stepper Pattern on Wafer

4.1 TSMC U-Frame requirements

TSMC will be building us frames 10 mm x 0.23 mm which will be placed either side of
both Area 1 and Area 2.

TSMC requires 6 mm area for blading between the two exposure [...]
[...] on the reticle, as some reticles are 2x size, while most are 5x, the worst case
must be used.
1.1 Document History

Version    Date               Author          Description

1.6        29 November 2002   Simon Walmsley  Updated ChipA to be ChipR to match protocols
                                              document; got rid of 68k reference now that we
                                              are using LEON.

1.5        26 November 2002   Simon Walmsley  Added description of storing more than a single
                                              SoPEC_id_key in a PRINTER_QA (in section 3.5.3
                                              and related). This reduces the cost of a
                                              multi-SoPEC system with no loss of security.
                                              Also added text to describe that batch keys can
                                              be different for each SoPEC if the indirect
                                              upgrade key protocol is used.

1.4        9 September 2002   Simon Walmsley  Added section in requirements detailing types of
                                              attacks we care about and don't care about.

1.3        30 August 2002     Simon Walmsley  Changed ComCo.OEM_xxxx variables into
                                              c>imnk# WW variables since that is more generic.
                                              Added text regarding ink refill. Added extra
                                              software authentication stage to prevent ComCos
                                              from fiddling with SoPEC software.

1.2        29 August 2002     Simon Walmsley  Added section on how the PRINTER_QA chip gets
                                              programmed with the SoPEC_id_key.

1.1        28 August 2002     Simon Walmsley  Updated to have ink and operating parameters be
                                              authenticated via symmetric key based signatures
                                              based on a unique SoPEC_id.

1.0        27 August 2002     Simon Walmsley  Updated after review.

0.2 draft  26 August 2002     Simon Walmsley  Changed public-key and private key references to
                                              asymmetric & symmetric respectively, so private
                                              can now sub-refer to the private key of the
                                              asymmetric pair, or the single private symmetric
                                              key. Changed OEM_id into ComCo_OEM_license_id to
                                              more accurately reflect the scope of the id.

0.1 draft  26 August 2002     Simon Walmsley  Initial issue.
1.3 Scope

This document describes the basic security requirements of programs running on the
SoPEC ASIC [1]. It then describes an implementation solution to the security
requirements.
1.4 Readership

This document is written for software engineers and [...]
with SoPEC, as well as PCB designers that are responsible for SoPEC-based Print
Engines. [...]
document useful.

This document is also intended to be read by those responsible for key management and
associated database designers with regards to guiding requirements.

This document is confidential to Silverbrook Research [...]
outside this organisation must be covered by a non-disclosure agreement (NDA).

1.5 QA Chip Terminology

The Authentication Protocols document [5] refers to QA Chips by their function in
particular protocols:

• For authenticated reads, ChipR is the QA Chip being read from, and ChipT is the QA
Chip that identifies whether the data read from ChipR can be trusted.

• For replacement of keys, ChipP is the QA Chip being programmed with the new key,
and ChipF is the factory QA Chip that generates the message to program the new key.

• For upgrades of data in memory vectors, ChipU is the QA Chip being upgraded, and
ChipS is the QA Chip that signs the upgrade value.

Any given physical QA Chip will contain functionality that allows it to operate as an
entity in some number of these protocols.

[...] as defined in [5].

[...] QA Chips are referred to by their location. For example, each ink cartridge may
[...] PRINTER_QA, and will be on a separate bus to the INK_QA chips.
2 Requirements

2.1 Security

The basic functional security requirements are:

• Silverbrook code and OEM program code co-existing safely

• Silverbrook operating parameters authentication

• OEM operating parameters authentication

• Ink usage authentication

Each of these is outlined in subsequent sections.

The authentication requirements imply that:

• OEMs and end-users must not be able to replace or tamper with Silverbrook program [...]

• [...] end-users must not be able to call unauthorized functions within Silverbrook [...]

• [...] must not be able to replace or tamper with OEM program [...]

• End-users must not be able to call unauthorized functions within OEM program [...]

• OEMs must be able to test products at their highest upgradable status, yet not be able
to ship them outside the terms of their license

[...] of operating system permitted GPIO pins and timers.

2.1.1 Silverbrook code and OEM program code co-existing safely

[...] activated.

[...] for SoPEC is a form of protection management, whereby [...]
be restricted to Silverbrook program code only.

2.1.2 Silverbrook operating parameters authentication
[...] program code.

However the OEM must be capable of assembly-line testing the Print Engine at the
[...] before selling the Print Engine to the end-user.

2.1.3 OEM operating parameters authentication

s^or^zsr«rrs^:-o-^^^ 

end-user Should not be abieto upgr^the o^^^^^ 

appropriate fee to the OEM. Similarly. ^^^^"^-"''^s^Je c Ibis implies that end-users 
:S«^ication mechanism via any p™^^^ on S^^ 

2.1.4 Ink usage authentication

E»h OEM sdU pri.«n "tf "."^"S^S'om^y sdl a= sm. 

of OEM2 printers can only use OEMj ink. 

o«le etc. It is impossible to guani against such an attack. 

we ^ reaUy o.y concerned -J^" ^^u^a^^^^^ 

of printer operating P^^^^.'^^S'^I'^hX oS Saccd by one that can be down- 

sJh an attack is where the SiWeibrook pn«t>«g <>^S « «P'^ J ^^t^ide the 

loaded fiom the i-t^-^^^l'^^.^^^^^^^^ Sp^by a hl« or by I rogue OEM is 
of the license agreement. 



1. a franking machine prints stamps 
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ciii^^thev can be transmitted via the inter- 

of a legitimate upgrade.
rity. 

" "-^rrrrx::---"— ------- 

tation constraints. These are: 

• No flash memory inside SoPEC 

• SoPEC must be simple to verify 

• Silverbrook program code must be updateable

• OEM program code must be updateable

• Must be bootable from activity on USB or ISI

• No extra pins to assign IDs to slave SoPECs

• Cannot trust the comms channel to the QA Chip in the printer (PRINTER_QA)

• Cannot trust the comms channel to the QA Chip in the ink cartridges (INK_QA)

• Cannot trust the ISI comms channel

These constraints are detailed below.
few bits. 

2.3.2 SoPEC must be simple to verify

All combinatorial logic and ^r^^^j;^^ ^Jtb^^^f^th^'SrSses verification 
before manufacture. Every mcrease m complexity m ei 

effort and increases risk. 
• finished in time for SoPEC manufacture 



Confidential

November 29, 2002

Silverbrook Research SoPEC Security Overview



Therefore the complete Silverbrook program code must not permanently reside on
SoPEC. It must be possible to update the Silverbrook program code as enhancements to
functionality are made and bug fixes are applied.

In the worst case, only new printers would receive the new functionality or bug fixes. In
the best case, existing SoPEC users can download new embedded code to enable
functionality or bug fixes. Ideally, these same users would obtain these updates from the
OEM website or equivalent, and not require any interaction with Silverbrook.

2.3.4 OEM program code must be updateable

Given that each OEM will be writing specific program code for printers that have not yet
been conceived, it is impossible for all OEM program code to be embedded in SoPEC at
the ASIC manufacture stage.

Since flash memory is not available (see Section 2.3.1), OEMs cannot store their program
code in on-chip flash. While it is theoretically possible to store OEM program code in
ROM on SoPEC, this would entail OEM-specific ASICs which would be prohibitively
expensive. Therefore OEM program code cannot permanently reside on SoPEC.

Since OEM program code must be downloadable for SoPEC to execute, it should
therefore be possible to update the OEM program code as enhancements to functionality are
made and bug fixes are applied.

In the worst case, only new printers would receive the new functionality or bug fixes. In
the best case, existing SoPEC users can download new embedded code to enable
functionality or bug fixes. Ideally, these same users would obtain these updates from the
OEM website or equivalent, and not require any interaction with Silverbrook.

2.3.5 Must be bootable from activity on USB or ISI 

SoPEC can be placed in sleep mode to save power when printing is not required. RAM is
not preserved in sleep mode. Therefore any program code and data in RAM will be lost.
However, SoPEC must be capable of being woken up so that it can print again.

In the case of a single SoPEC system, the host communicates with SoPEC via USB. 

In the case of a multi-SoPEC system, the host typically communicates with the ISI Master
chip (e.g. the ISI Master could be SoPEC, and the comms is USB), and can send messages
to the slave SoPECs via the ISI master. The ISI master SoPEC relays these messages to
the slaves via the ISI.

Therefore SoPEC must be capable of being woken up by activity on either the USB or on
the ISI.

2.3.6 No extra pins to assign IDs to slave SoPECs

In a single SoPEC system the host only sends data to the single SoPEC. However in a
multi-SoPEC system, each of the slaves needs to be uniquely identifiable so that the host
can send data to the correct slave.

Since there is no flash on board SoPEC (Section 2.3.1) we are unable to store a slave ID
(e.g. 4 bits) in each SoPEC. Moreover, any ROM in each SoPEC will be identical.
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'X "sSS^sLs. W. ^v. 2 pins r« i».«-S.PEC commumo.- 

tions, and further pins would add to the cost. 

2.3.7 Cannot trust the comms channel to the QA Chip in the printer (PRINTER_QA)

If the operating parameters are stored in the non-volatile memory of the Print
Engine's on-board PRINTER_QA chip, both Silverbrook and OEM program code cannot
rely on the communication channel being secure. It is possible for an end-user to replace
the PRINTER_QA chip or subvert the communications channel.

2.3.8 Cannot trust the comms channel to the QA Chip in the ink cartridges (INK_QA)
2.3.8 l^annox iru&i u v . . v ^^r^Aa^ U stored in the non-volatile mem- 

oty of that ink cartntlgc 5 WK^QA teinl! sKure >< " P°ssiWe f« " 

to...Mltl-Sc.reC«^"^^«^<"^5^'"^SI. I. is quite possible to » 
man-in-tfic-middlc attacks). 
3 Proposed Solution

A proposed solution to the requirements of Section 2 can be summarised as:

• Each SoPEC has a unique id

• CPU with user/supervisor mode

• Memory Management Unit

• SoPEC ISI identification

3.1 Each SoPEC has a unique ID

Each SoPEC needs to contain a unique SoPEC_id of minimum size 64 bits¹. This
SoPEC_id is used to form a symmetric key unique to each SoPEC.

T»e vc^fic^n of .P<».i« P»»«- ^^.il^^SSlSSSt-^^ » 

» aaemi... Difficult » "'"^.^^S^bl^ chips o. the b««J. 

»i« lie M »!• so»«««. or by vewmg tfcc ="™^Xe on spoolfie lot pins on tie 

chip, then dcpendmg on the case oy wmcu «" 

,, is i^pon™. to oo.. th. in the P~P°»J^t?>S^.^^»^''^rS 

3.2 CPU with user/supervisor mode

SoPBC «»taios , CU f-S^.^S^"^-. » 

V8 instmction set). 

Silverbrook (operating system) program code will run in supervisor mode, and all OEM
program code will run in user mode.

3.3 Memory Management Unit

DRAM by defining read, wntc and fl. . ^ ^ ^ ^ode permission settings. 



\ . On roM-s CUl 1 process this cbipid is 80 bits. 
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*u«* rtf npcess fe E read, write, execute) is 

mitted. 
DRAM 

sMiccssed by user code. These registers win . , 

M ^ ,t 0x0000 0000 to support programmable exception 
The embedded DRAM should start '^^^^It^^,^-^ Lislated in the MMU to 

vector.. The reset ex^J-ve <-tJ°S2ry that still allows null pointer 

point to the appropnate location in RUM. laeauy 

dereferencing to be tr^ed. 

With respect to the DRAM and PEP subsystems of SoPEC, typically we would set user
read/write/execute mode permissions to be 1/1/0 only in the region of memory that is used
for OEM program data, 1/0/1 for regions of OEM program code, and 0/0/0 elsewhere. By
contrast we would typically set supervisor mode read/write/execute permissions for this
memory to be 1/1/0 (to avoid accidentally executing user code in supervisor mode).
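The typical permission settings can be pictured as a small lookup table keyed by CPU mode
and memory region. The following Python sketch is illustrative only: the region names, the
`Perm` type and the `is_allowed` helper are assumptions made for the example, and the real
MMU is a hardware block whose register layout is not described here. The 1/1/0 value for
user access to OEM program data is also an assumption, chosen as the natural setting for a
data region.

```python
# Illustrative model only: region names and table layout are assumptions,
# not the SoPEC MMU register format.
from typing import NamedTuple


class Perm(NamedTuple):
    read: bool
    write: bool
    execute: bool


def perm(bits: str) -> Perm:
    """Parse the read/write/execute shorthand used in the text, e.g. "1/0/1"."""
    r, w, x = (b == "1" for b in bits.split("/"))
    return Perm(r, w, x)


# (mode, region) -> permissions, following the typical settings in the text.
PERMISSIONS = {
    ("user", "oem_data"): perm("1/1/0"),        # assumed value for OEM data
    ("user", "oem_code"): perm("1/0/1"),
    ("supervisor", "oem_data"): perm("1/1/0"),
    ("supervisor", "oem_code"): perm("1/1/0"),  # never execute user code
}
USER_DEFAULT = perm("0/0/0")  # user mode has no access elsewhere


def is_allowed(mode: str, region: str, access: str) -> bool:
    """Return True if `access` ("read", "write" or "execute") is permitted."""
    default = USER_DEFAULT if mode == "user" else None
    p = PERMISSIONS.get((mode, region), default)
    if p is None:
        # Supervisor access to other regions is not modelled in this sketch.
        raise KeyError(f"no permissions modelled for {mode}/{region}")
    return getattr(p, access)
```

With this table, user-mode code can execute but not modify OEM program code, and
supervisor-mode code can read and write that memory but cannot execute it.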

/ c-^f^,^n 1 U should only be accessible in supervisor mo^ 



access. 



3.4 Specific entry points in O/S

implementation for this depends on the CPU. 

o. *e LEON ^'^■'^^^"'■^^s^::^'^.'^- 

and supervisor mode ui a controlled way. inc code soace in supervisor 

user mode. 

updates occur. 

The LEON also allows supervisor mode code to call user mode code. There are a number
of ways that this functionality can be implemented.

3.5 Boot Procedure

3.5.1 Basic premise 



The intention is to load the Silverbrook and OEM program code into SoPEC's
RAM, where it can be subsequently executed. SoPEC must therefore be capable of
downloading program code. However it must also guarantee that only
authorized Silverbrook boot programs can be downloaded, otherwise anyone could
modify the O/S to do anything, and then download that - thereby bypassing the licensed
operating parameters.

We perform authentication of program code and data using asymmetric cryptography and
without using a QA Chip.

Assuming we have already downloaded some data and a 160-bit signature into eDRAM,
the boot loader needs to perform the following tasks:

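• perform SHA-1 on the downloaded data to calculate a digest localDigest

• perform asymmetric decryption on the downloaded signature using an asymmetric public
key to obtain authorizedDigest

• if authorizedDigest is the same as localDigest, the downloaded data is authorized and
control can be passed to the downloaded data

Asymmetric decryption is used because the decrypting
key must be held in SoPEC's ROM. If symmetric private keys are used, the ROM can be
probed and the security is compromised.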

The procedure requires the following data item:

• bootOkey = an n-bit asymmetric public key

The procedure also requires the following two functions:

• SHA-1 = a function that performs SHA-1 on a range of memory and returns a 160-bit
digest

• decrypt = a function that performs asymmetric decryption of a message using the
passed-in key

Assuming that all of these are available (e.g. in the boot ROM), boot loader 0 can be 
defined as in the following pseudocode: 



bootloader0(data, sig)

  localDigest ← SHA-1(data)
  authorizedDigest ← decrypt(sig, bootOkey)

  If (localDigest = authorizedDigest)
    jump to program code at data-start address // will never return
  Else
    // program code is unauthorized
  EndIf
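As a rough illustration of boot loader 0, the following Python sketch mirrors the
pseudocode. It is not the SoPEC implementation: the names `sha1_digest`, `sign` and
`boot0_key` are hypothetical, and the asymmetric step is unpadded textbook RSA with a
deliberately small demonstration key built from two known Mersenne primes, chosen only
so that the example is self-contained. The function returns True where the pseudocode
would jump to the downloaded program code.

```python
# Toy sketch of boot loader 0. The RSA parameters are demonstration values
# with no padding: an assumption for illustration, far too weak for real use.
import hashlib

P, Q = 2**127 - 1, 2**89 - 1          # two Mersenne primes
N, E = P * Q, 65537                   # public key, standing in for bootOkey
D = pow(E, -1, (P - 1) * (Q - 1))     # private key, held only by the signer


def sha1_digest(data: bytes) -> int:
    """SHA-1 over the data, returned as a 160-bit integer (localDigest)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def sign(data: bytes) -> int:
    """Signer side (off-chip): transform the digest with the private key."""
    return pow(sha1_digest(data), D, N)


def decrypt(sig: int, public_key) -> int:
    """Asymmetric decryption of the signature using the passed-in public key."""
    n, e = public_key
    return pow(sig, e, n)


def bootloader0(data: bytes, sig: int, boot0_key=(N, E)) -> bool:
    """Return True if the data is authorized, False otherwise."""
    local_digest = sha1_digest(data)
    authorized_digest = decrypt(sig, boot0_key)
    return local_digest == authorized_digest
```

Only the public half of the key pair appears in `bootloader0`, which is the property the
text relies on: probing the ROM reveals nothing that allows new program code to be signed.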



from some hacker in Norway). 
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In the case of RS A. ^ 2048-b.t key « requir ^ ^^^^^ ^^^^ 

of the QA Chip. Iti the case of ECDSA. a Key lengui 

^ „ ™,.itinle kevs in SoPEC and having the external 
There is also no advanUge to stonng ^''^rbecaLe a compromise of any key allows 

hootOkey secure. 



ify and characterize 

3.5.2 Hierarchies of authentication

Given that test prognuns. evaluation P^^^'^^^^^^^^L^ •^\rJcu» to 
written and tested, and OEM program code ^^'^^^^^'^u.erWk 0/S. non-O/S. 
have a single authentication of a "^""^^^^^."S-s^gmng SUverbrook program 

code. 

code contains the key for authenticating the next 

For example, assume that we have the following entities: 

. ComCo,aooaW<""^«^^S^,^,^ 

. OEM. a company that uses a Print a^e to^^ 

end-users. The OEM would supply the motor control logic 
The levels of authentication hierarchy are as follows:

. SoPECCo generates '^'^'^^-^^^^^ri^^^^X^y^^y- ^^^^^ 
ttie print engine functtonaUty) and the "^^^o^^ ^y. The print engine 

is an operating parameter block for a given OEM s P"" » asymmetric private 
the priTengine license ^^-^^^^g v^lid p^^ 
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c^te as many of these operating P-meter bloc*, for ^num^r of Pnnt Engine 
Licenses, but cannot write or sign any supervisor O/S p^gram 

. The OEM would generate 

dataset4 is the OEM program code vnA the OEM s asy^n 

The OEM can produce as many versions of datasetS as it UKes (t.g. 
poses or for updates to drivers etc) 

The relationship is shown below in Figure 1. 



Figure 1. Relationship between the datasets (diagram not reproduced; its boxes show the
datasets supplied to the ComCo, to the OEM and to the end-user)

c^ppr itcelf validates dataseti via the bootOkey mech- 
When the end-user uses datasetS ^^^^'"^^^^^ ^ vaHdates dataseti. and 

SoPEC boot ROM (includes bootOkey public key)
    |  validation via bootOkey
dataset1: operating system (includes ComCo public key)
    |  validation via ComCo key
dataset2: operating params (includes OEM public key)
    |  validation via OEM key
dataset4: OEM program code

Figure 2. Validation hierarchy
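The validation hierarchy can be sketched as a loop in which each verified dataset supplies
the public key used to verify the next one. The Python below is a minimal model under
stated assumptions: the `Dataset` type, `make_keypair`, `sign` and `verify_chain` are
hypothetical names, the signed message layout (payload followed by the embedded key) is
invented for the example, and the keys are unpadded toy RSA keys built from known Mersenne
primes rather than keys of a realistic length.

```python
# Toy chain-of-trust model; key sizes and message layout are assumptions.
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple

E = 65537  # public exponent shared by all toy keys


def make_keypair(a: int, b: int):
    """Toy RSA key pair; 2**a - 1 and 2**b - 1 must both be prime (insecure)."""
    p, q = 2**a - 1, 2**b - 1
    return (p * q, E), pow(E, -1, (p - 1) * (q - 1))


def digest(data: bytes) -> int:
    """SHA-1 digest as an integer."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


@dataclass
class Dataset:
    payload: bytes
    next_key: Optional[Tuple[int, int]] = None  # public key for the next level
    sig: int = 0

    def signed_bytes(self) -> bytes:
        # The embedded key is covered by the signature along with the payload.
        return self.payload + repr(self.next_key).encode()


def sign(ds: Dataset, public_key, private_key) -> Dataset:
    """Signer side: sign the dataset with the private half of the key pair."""
    n, _ = public_key
    ds.sig = pow(digest(ds.signed_bytes()), private_key, n)
    return ds


def verify_chain(root_key, datasets) -> bool:
    """Verify each dataset with the key supplied by the level above it."""
    key = root_key
    for ds in datasets:
        if key is None:
            return False  # the previous level did not authorize a further key
        n, e = key
        if pow(ds.sig, e, n) != digest(ds.signed_bytes()):
            return False
        key = ds.next_key
    return True
```

A change to any dataset, or to any embedded key, breaks verification at that level and at
every level below it, which is why the whole hierarchy rests on the private key paired to
the root key.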
mask ROM change in SoPEC to fix.

It is worthwhile repeating that in any hierarchy the security of the entire hierarchy is
based on keeping the asymmetric private key paired to bootOkey secure. It is also a
requirement that the program that signs (i.e. authorizes) datasets using the asymmetric
private key paired to bootOkey is secure.

3.5.3 Authenticating operating parameters
ment of host O/S drivere etc. 

On PRINTER_QA, memory vector M0 contains the upgradable operating parameters, and
memory vectors M1+ contain any constant (non-upgradable) operating parameters.

Considering only Silverbrook operating parameters for the moment, there are actually two
problems:

a. setting and storing the Silverbrook operating parameters, which should be
authorized only by Silverbrook

b. reading the parameters into SoPEC, which is an issue of SoPEC authenticating the data
on the PRINTER_QA chip since we don't trust PRINTER_QA.

The PRINTER_QA chip therefore contains the following symmetric keys:

Tjtprc a J.V Tbis kw is unique for ««» SoPEC (!«e Secora 3.1), »«1 » 

anything. , . 

^Kiom Tt i<5 onlv used to authenticate the actual upgrades of ttie 
Ko is used to solve problem (a). It is oniy usea lo a ^^^^ upgrade protocol 

ing as the ChipS. 

'Z^^ „,i«er i. SOPEC-S toe. imt wh«. a. Sn. p.ge -mves. 
Note that the procedure for verifying reads of data from PRINTER_QA does not rely on
S whSok^s This means tlut precisely the same mechanism can be used to read 

by Silverbrook supervisor code so that SoPEC_id_key is not revealed.

If the OEM also requires upgradable parameters, we can add an extra key to
PRINTER_QA, where that key is an OEM_key and has write permissions to the OEM
part of M0.

In this way, K1 never needs to be known by anyone except the SoPEC and PRINTER_QA.

Each printing SoPEC in a multi-SoPEC system needs access to a PRINTER_QA chip that
contains the appropriate SoPEC_id_key. This can be accomplished by a separate
PRINTER_QA for each SoPEC, or by adding extra keys (multiple SoPEC_id_keys) to a
single PRINTER_QA.

However, if ink usage is not being vaUdated (e.g. if print speed -^^J.^^^^^^^^ 
^gradableparamet2tl.n^t^^^^^ 

m:trsS^S^^^^^P- Sfthen the PHI within the fi^ (or o^y^ 
sSS:^ be programmed to accept (or generate) line sync pulses at a part.<> 
^?«Tlf?tees^ «riv5 fester than the particular rate, the PHI would generate a 
?uffeTL^ S^fw^i mrthat even if L motor speed was backed to be ^, the 
print will terminate. 

3.5.3.1 OEM assembly-line test 

As described in Section 2.1.2, Silverbrook operating parameters include such items as
print speed, print quality etc. and are tied to a license provided to an OEM. These
parameters are under Silverbrook control. The licensed Silverbrook operating parameters
are stored in the PRINTER_QA as described in Section 3.5.3.

However although an OEM should only be able to sell the licensed operating parameters
for a given Print Engine, they must be able to assembly-line test the Print Engine with a
different set of operating parameters i.e. a maximally upgraded Print Engine.

Several different mechanisms can be employed to allow OEMs to test the upgraded
capabilities of the Print Engine. At present it is unclear exactly what kind of
assembly-line tests would be performed.

At first thought, it might be considered that a donglcstyle approach using a spedal marter 
contSing upgraded parameters might be -^-^"^--^ ^^^^ 
SOPEC to a2:ept the parameters as true, the sp^i^ 'SS'^JSS'K s^T^X 

rlTmechanism. which fanplies either a Silverb«K,k box or a com,ect.on to a Sdver- 
brook machine (e.g. over a net). Neither approaches are good. 
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. . ou TMXPR OA for testing, then we must make use of special 

If there is no special "^^^^^^^^I^^ Z '^lk The solution will depend on the 
test programs, or storage on the PRINl KK^VJa, 
test requirements of the OEM. 

would not want the OEM to have such a program. 

ml only io« tWs Omsf ths tmmg «' ""^ » So ««S tea im.g«. This ™y 

toaito ihal looK uid betaves like a leal Pniit Eiig.«c ,. / j^„„ end of 

^,imo<l»p«l,li«,u.»"»>c«"»l!'l»*""''"*'*«"- 

If A. OEM m,m«s l=<ts tta. .ctually prints doK, there »e several possibilWes: 
IftleOBMmi .,^-,.,.,„ejf„^oEM)thai»iUonlyprintspeaalSJver. 

OEM test patterns cannot be pnnted. 
b A version of the O/S that prinu garbage in special places over the test tmagc. 
A^^^ SisSe disadvantage that special OEM test patterns cannot be 

pnnted. decrements a DecrementOnly value in 

L^^^SunatMlupgn^c^p^i^-t^^^^^ 

PRINTER QA customization may only need to be i or 

Of these SOM..S. oplion W U P«**,y --JJ^iJ^^QTlsTs^S. 
useM. If the test P-^'«'»','^:^'l^,''Sl'^^^^;^<^f«^'''>^ 

the subsequent OEM program code « '^'^^f ^ .; '^^^^^ for different OEM's 

SoPEC only contains a single root key. f^^^f ""^.^ U prii^r driver for OEM^ run 
applications to be run identically physical Pnnt Engmes i.e. prime 
on an identically p/iy^ica/ Print Engine from OEMj. 

^ against thj the SiWerb.^^^^^ 
PrintEngmeLicense_id code J ^^^'ig^^ ^^'j J" m,+). As with all other operating 
fixed operating parameter m ^^^^^^ tuld be stored in PRINTER_QA at the 
parameters, the value of PnntEngineLicense_ta wouia 0,=. 



Silverbrook Research SoPEC Security Overview 4-4-1-3 v1.6



same time as the other various PRINTER_QA customizations are being applied, before
being shipped to the OEM site.

In this way, the OEMs can be sure of differentiating themselves through software
functionality.

3.5.5 Authentication of ink

of dots printed for each inlc 

Other data stored on the INK_QA cmp ^ qe^j id, inkType. 

be stored in M1+ within INK_QA.

SoPEC must be able to authenticate reads from the INK_QA, both in terms of ink
parameters as well as ink remaining.

To authenticate ink a number of steps must be taken:

• restrict access to dot counts

• authenticate ink usage and ink parameters via INK_QA and PRINTER_QA

• broadcast ink dot usage to all SoPECs in a multi-SoPEC system

3.5.5.1 restrict access to dot counts " ^^^^^^ ,f g^pEC, access to these 

Since the dot counts are -"^^^/TEf re '^^^^ available ftom supervisor 

SeM p^g-^ Sde to clear dot counts before authentication has occuned. 

3.5.5.2 authenticate ink usage and ink parameters via INK_QA and PRINTER_QA

3.5.5.2 -"'^*"4^^^^^^,,,,,,,,,^..,eationofinkre^^^^ 

J^oblemtlL we donHtn^t INK.QA. Therefore how ^ 

of ink (or the ink parameters), and how can a SoPEC know 

INK QA the count has been correctly decremented. 

INK^QA. 

We cannot write the SoPEC_id_key to the INK_QA for two reasons:

• updating keys is not power-safe (i.e. if power is removed mid-update, the INK_QA
could be rendered useless)

not know the old SoPEC_id_key (knowledge of the old key is required to
change the old key to a new one).

The proposed solution is to let INK_QA have two keys:

pennissions to anything. • ^« „ tr. fill 

amlreffll tie amount of inlt). Upgrades •« o„e„^ upgi«iT aaiug as 

describe i. 151, wltb '^'^^"^.".^^ SeTto cheek tta apptoprtat. » 
^^S.^ll"roSSrS*aSl«;s..eUee«eJ.f...aU.»y. 

-j!r>;a=^^c.^:rjst^:£^x"^^^^^ 

(e.g. in Ka), also with no write pennissions. 

This means there are two shared keys, with PRINTER_QA sharing both, and thereby
acting as a bridge between INK_QA and SoPEC.

• SoPEC_id_key is shared between SoPEC and PRINTER_QA
.USoPBC^^.doisdoana^ca^^^^^ 

ture to PRINTER^QA, let PRINTER-QA ^f'™:*!^,. shared So/>fC itf Jfcey. SoPEC 
Son INK.QA mt« be valid, and ean therefote be motel 

one. the data .o. mK.QA is^o^ ^^SSt^ « oSfXS^. 

Checked, and the other ink licensing parameters sucn 
InkUsageLicenseJd can be checked for vahdity. 

The actual steps of read authentication as performed by SoPEC are: 



^ W/ si^le consT-nts .o specUy WHicl. Hey to use -.e. si,nin, 

^ ?r"^".^C°re.a,KEVX. «„„™>// re., wi.n .eyX= U.eXnK.icense.Key 

^l^ZJ^'l^^^ x„KUs.,e..cen,e_xa. o. .n. .e«»in.„. 
    If (inkRemaining = expectedInkRemaining)
      // all is ok
    Else
      // the ink value is not what we wrote, so don't print anything anymore
    EndIf
  Else
    // the data read from INK_QA is not valid and cannot be trusted
  EndIf



Strictly speaking, we don't need a nonce (R_SoPEC) all the time because M0 (containing
the ink remaining) should be decrementing between authentications. However we do need
one to retrieve the initial amount of ink and the other ink parameters (at power up). This is
why taking a random number from the WatchDogTimer at the receipt of the first page is
acceptable.

In summary, the SoPEC performs the non-authenticated write [5] of ink remaining to the
INK_QA chip, and then performs an authenticated read of the data via the PRINTER_QA
as per the pseudocode above. If the value is authenticated, and the INK_QA ink-remaining
value matches the expected value, the count was correctly decremented and the printing
can continue.
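The final check performed by SoPEC can be sketched as follows. This is a simplified
model, not the QA Chip protocol of [5]: it assumes the signature is an HMAC-SHA1 over the
nonce followed by the data, that the ink-remaining value is a 4-byte big-endian integer,
and that the key passed in stands for the SoPEC_id_key shared with PRINTER_QA. The
function names and the three result strings are hypothetical.

```python
# Simplified model of an authenticated read; message layout is an assumption.
import hashlib
import hmac


def qa_signed_read(key: bytes, nonce: bytes, ink_remaining: int):
    """Stand-in for the QA Chip side: return (data, signature)."""
    data = ink_remaining.to_bytes(4, "big")
    sig = hmac.new(key, nonce + data, hashlib.sha1).digest()
    return data, sig


def sopec_check_read(key: bytes, nonce: bytes, data: bytes, sig: bytes,
                     expected_ink_remaining: int) -> str:
    """Mirror the two nested tests in the read-authentication pseudocode."""
    expected_sig = hmac.new(key, nonce + data, hashlib.sha1).digest()
    if not hmac.compare_digest(sig, expected_sig):
        return "untrusted"        # the data read is not valid
    if int.from_bytes(data, "big") != expected_ink_remaining:
        return "stop-printing"    # the ink value is not what we wrote
    return "ok"
```

Including the nonce in the signed message is what prevents a previously recorded response
from being replayed for the initial read at power-up.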

3.5.5.3 broadcast ink dot usage to all SoPECs in a multi-SoPEC system

In a multi-SoPEC system, each SoPEC attached to a printhead (4 at most) must broadcast
its ink usage to all the SoPECs. In this way, each SoPEC will have its own version of the
expected ink usage.

In the case of a man-in-the-middle attack, at worst the count in a given SoPEC is only its 
own count (i.e. all broadcasts are turned into 0 ink usage by the man-in-the-middle). 

A single SoPEC performs the update of ink remaining to the INK_QA chip, and then all
SoPECs perform an authenticated read of the data via the appropriate PRINTER_QA (the
PRINTER_QA that contains their matching SoPEC_id_key - remember that multiple
SoPEC_id_keys can be stored in a single PRINTER_QA). If the value is authenticated,
and the INK_QA value matches the expected value, the count was correctly decremented
and the printing can continue.

If any of the broadcasts are not received, or have been tampered with, the updated ink 
counts will not match. The only case this does not cater for is if each SoPEC is tricked (via 
an ISI man-in-the-middle attack), into a total that is the same, yet not the true total. Apart 
from the fact that this is not viable for general pages, at worst this is the maximum amount 
of ink printed by a single SoPEC. We don't care about protecting against this case. 

Since there will be at most 4 printing SoPECs, it requires at most 4 authenticated reads.
This should be completed within 0.5 seconds - well within the 2 seconds/page print time.

3.5.6 Example hierarchy

The exact breakdown of hierarchy will depend on a later investigation, but for the
purposes of scoping out possibilities, it is worthwhile considering an example hierarchy for
illustrative purposes.

Adding an extra bootloader step to the example from Section 3.5.2, we can break up the
contents of program space into logical sections, as shown in Table 1. Note that the ComCo
does not provide any program code, merely operating parameters that are used by the O/S.



Table 1. Sections of Program Space

Section 0 (ROM)
  Contents: boot loader 0
            SHA-1 function
            asymmetric decrypt function
            bootOkey
  Verifies: section 1 via bootOkey

Section 1
  Contents: boot loader 1
            SoPEC_OS_public_key
  Verifies: section 2 via SoPEC_OS_public_key

Section 2
  Contents: Silverbrook O/S program code
            function to generate SoPEC_id_key from SoPEC_id
            Basic Print Engine
            ComCo_public_key
  Verifies: section 3 via ComCo_public_key
            section 4 via OEM_public_key (supplied in section 3)
            PRINTER_QA data, which includes the PrintEngineLicense_id, Silverbrook
            operating parameters, and OEM operating parameters (all authenticated via
            SoPEC_id_key)

Section 3
  Contents: ComCo license agreement operating parameter ranges, including
            PrintEngineLicense_id (gets loaded into supervisor mode section of memory)
            OEM_public_key (gets loaded into supervisor mode section of memory)
            Any ComCo written user-mode program code (gets loaded into user mode
            section of memory)
  Verifies: is used by section 2 to verify section 4 and range of parameters as found
            in PRINTER_QA

Section 4
  Contents: OEM specific program code
  Verifies: OEM operating parameters via calls to Silverbrook O/S code



The verification procedures will be required each time the CPU is woken up, since the
RAM is not preserved.

3.5.7 What if the CPU is not fast enough?

In the example of Section 3.5.6, every time the CPU is woken up to print a document it
needs to perform:

• SHA-1 on all program code and program data

• 4 sets of asymmetric decryption to load the program code and data

• 1 HMAC-SHA1 generation per 512-bits of Silverbrook and OEM printer and ink
operating parameters

Although the SHA-1 and HMAC process will be fast enough on the embedded CPU (the
program code will be executing from ROM), it may be that the asymmetric decryption
will be slow. And this becomes more likely with each extra level of authentication. If this
is the case (as is likely), hardware acceleration is required.

A cheap form of hardware acceleration takes advantage of the fact that in most cases the
same program is loaded each time, with the first time likely to be at power-up. The
hardware acceleration is simply data storage for the authorizedDigest which means that the
boot procedure now is:






slowCPU_bootloader0(data, sig)

  localDigest ← SHA-1(data)

  If (localDigest = previouslyStoredAuthorizedDigest)
    jump to program code at data-start address // will never return
  Else
    authorizedDigest ← decrypt(sig, bootOkey)
    If (localDigest = authorizedDigest)
      previouslyStoredAuthorizedDigest ← authorizedDigest
      jump to program code at data-start address // will never return
    Else
      // program code is unauthorized
    EndIf
  EndIf



This procedure means that a reboot of the same authorized program code will only require
SHA-1 processing. At power-up, or if new program code is loaded (e.g. an upgrade of a
driver over the internet), then the full authorization via asymmetric decryption takes place.
This is because the stored digest will not match at power-up and whenever a new program
is loaded.
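The effect of the stored digest can be sketched in Python as below. This is a model of the
control flow only: the class name, the `slow_path_count` counter and the injected `decrypt`
callable are assumptions made so that the example can show how often the expensive
asymmetric step runs. The method returns True where the pseudocode would jump to the
program code.

```python
# Control-flow model of the digest cache; names are hypothetical.
import hashlib


class SlowCpuBootloader:
    def __init__(self, decrypt, boot_key):
        self.decrypt = decrypt        # asymmetric decryption function (injected)
        self.boot_key = boot_key      # public key held in boot ROM
        self.stored_digest = None     # preserved across sleep, lost at power-off
        self.slow_path_count = 0      # number of asymmetric decryptions performed

    def authorize(self, data: bytes, sig) -> bool:
        """Return True if the program code may be executed."""
        local_digest = hashlib.sha1(data).digest()
        if local_digest == self.stored_digest:
            return True               # fast path: SHA-1 only
        self.slow_path_count += 1
        authorized_digest = self.decrypt(sig, self.boot_key)
        if local_digest == authorized_digest:
            self.stored_digest = authorized_digest
            return True
        return False                  # program code is unauthorized
```

Note that on the fast path the signature is not examined at all, so the security of the
shortcut depends on the stored digest being writable only by the boot loader.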

The question is how much preserved space is required.

Each digest requires 160 bits (20 bytes), and this is constant regardless of the asymmetric
encryption scheme or the key length. While it is possible to reduce this number of bits,
thereby sacrificing security, the cost is small enough to warrant keeping the full digest.

However each level of boot loader requires its own digest to be preserved. This gives a
maximum of 20 bytes per loader. Digests for operating parameters and ink levels may also
be preserved in the same way, although these authentications should be fast enough not to
require cached storage.

Assuming SoPEC provides for 12 digests (to be generous), this is a total of 240 bytes.
These 240 bytes could easily be stored as 60 × 32-bit registers, or probably more
conveniently as a small amount of RAM (e.g. 0.25 - 1 Kbyte). Providing something like
1 Kbyte of RAM has the advantage of allowing the CPU to store other useful data, although
this is not a requirement.

In general, it is useful for the boot ROM to know whether it is being started up due to
power-on reset or activity on the USB/ISI. In the former case, it can ignore the previously
stored values (either 0 for registers or garbage for RAM). In the latter case, it can use the
previously stored values. Even without this, a startup value of 0 (or garbage) means the
digest won't match and therefore the authentication will occur implicitly.

3.6 SoPEC ISI IDENTIFICATION 

At power-up, the host can send targeted data to the USB-connected SoPEC, but can only 
send broadcasts to all of the slave SoPECs via the USB-connected SoPEC's ISI. 

Each slave SoPEC will verify the broadcast message received over the ISI, and if it is
valid, will execute it. Several levels of authorization may occur. However, at some stage,
this common program code (broadcast to all of the slave SoPECs and signed by the
appropriate asymmetric private key) will, among other things, set the slave SoPEC's ISI id.
If there is only 1 slave, the id is given, but if there is more than 1 slave, the id must be
determined in some fashion.

On a particular physical arrangement of SoPECs each slave SoPEC will have a different
set of connections on GPIOs. For example, one SoPEC may be in charge of motor control,
while another may be driving the LEDs etc. The unused GPIO pins (not necessarily the
same on each SoPEC) can be set as inputs and then tied to 0 or 1. As long as the
connection settings are mutually exclusive, program code can determine which is which,
and the id appropriately set.
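One way to picture this is to read the tied-off pins as the bits of the id. The sketch
below is an assumption about how such program code might look, not a description of the
actual SoPEC firmware: the pin numbers, the bit ordering and both function names are
hypothetical.

```python
# Hypothetical derivation of an ISI id from strap pins tied to 0 or 1.
from typing import Dict, List, Sequence


def isi_id_from_straps(pin_levels: Dict[int, int], strap_pins: Sequence[int]) -> int:
    """Combine the sampled strap pin levels into an id; the first pin is the MSB."""
    isi_id = 0
    for pin in strap_pins:
        isi_id = (isi_id << 1) | (pin_levels[pin] & 1)
    return isi_id


def straps_mutually_exclusive(ids: List[int]) -> bool:
    """Check that no two slaves in the arrangement resolve to the same id."""
    return len(set(ids)) == len(ids)
```

For example, a slave sampling pins 4 and 5 as 0 and 1 and another sampling pins 6 and 7 as
1 and 0 would resolve to different ids even though they use different pins.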

In some multi-SoPEC systems, a given SoPEC will only be attached to a single printhead 
(left or right). We can conveniently use the second printhead connection pins (temperature 
and test) to form an ISI id. 

This scheme of slave SoPEC identification does not introduce a security breach. If an 
attacker rewires the pinouts to confuse identification, at best it will simply cause strange 
printouts (e.g. swapping of printout data) to occur, while at worst the Print Engine will 
simply not function. 

Note that some physical setting (e.g. pins) on each of the multiple SoPECs is required - the
settings just need to be mutually exclusive. Although it is possible for all the SoPECs to
come to a logical ISI id assignment (e.g. by using Ethernet-like protocols), the ISI id needs
to be very much a physical identity scheme. This is because these SoPECs are not simply
logical processors - we want the correct portion of the page to be printed on the correct
physical location, motor controls will be physically connected to a specific physical
SoPEC etc.

3.7 Setting up QA Chip keys 

In use, each INK_QA chip needs the following keys:

• K0 = SupplyInkLicense_key

• K1 = UseInkLicense_key

Each PRINTER_QA chip tied to a specific SoPEC requires the following keys:

• K0 = PrintEngineLicense_key

• K1 = SoPEC_id_key

• K2 = UseInkLicense_key

Note that there may be more than one K1 depending on the number of PRINTER_QA
chips and SoPECs in a system. These keys need to be appropriately set up in the QA Chips
before they will function correctly together.

3.7.1 Original QA Chips as received by a ComCo 

When original QA Chips are shipped from QACo to a specific ComCo their keys are as
follows:

• K0 = QACo_ComCo_Key0

• K1 = QACo_ComCo_Key1

• K2 = QACo_ComCo_Key2

• K3 = QACo_ComCo_Key3

All 4 keys are only known to QACo. Note that these keys are different for each QA Chip.

3.7.2 Steps at the ComCo

The ComCo is responsible for making Print Engines out of Memjet printheads, QA Chips,
PECs or SoPECs, PCBs etc.

In addition, the ComCo must customize the INK_QA chips and PRINTER_QA chip
on-board the print engine before shipping to the OEM.

There are two stages:

• replacing the keys in QA Chips with specific keys for the application (i.e. INK_QA
and PRINTER_QA)

• setting operating parameters as per the license with the OEM

3.7.2.1 Replacing keys

The ComCo is issued QID hardware [4] by QACo that allows programming of the various
keys (except for K1) in a given QA Chip to the final values, following the standard
ChipF/ChipP replace key (indirect version) protocol [5]. The indirect version of the
protocol allows each QACo_ComCo_Key to be different for each SoPEC.

In the case of programming of PRINTER_QA's K1 to be SoPEC_id_key, there is the
additional step of transferring an asymmetrically encrypted SoPEC_id_key (by the public-key)
along with the nonce (Rp) used in the replace key protocol to the device that is functioning
as a ChipF. The ChipF must decrypt the SoPEC_id_key so it can generate the standard
replace key message for PRINTER_QA (functioning as a ChipP in the ChipF/ChipP
protocol). The asymmetric key pair held in the ChipF equivalent should be unique to a
ComCo (but still known only by QACo) to prevent damage in the case of a compromise.

Note that the various keys installed in the QA Chips (both INK_QA and PRINTER_QA)
are only known to the QACo. The OEM only uses QIDs and QACo supplied ChipFs. The
replace key protocol [5] allows the programming to occur without compromising the old
or new key.

3.7.2.2 Setting operating parameters

There are two sets of operating parameters stored in PRINTER_QA and INK_QA:

• fixed

• upgradable

The fixed operating parameters can be written to by means of non-authenticated writes
[5] to M1+ via a QID [4], and permission bits set such that they are ReadOnly.

The upgradable operating parameters can only be written to after the QA Chips have been
programmed with the correct keys as per Section 3.7.2.1. Once they contain the correct
keys they can be programmed with appropriate operating parameters by means of a QID
and an appropriate ChipS (containing matching keys).

3 Introduction

This document describes the SoPEC ASIC (Small office home office Print Engine Controller) suitable for 
use in price sensitive SoHo printer products. The SoPEC ASIC is intended to be a low cost solution for bi- 
lithic printhead control, replacing the multichip solutions in larger more professional systems with a single 
chip. The increased cost competitiveness is achieved by integrating several systems such as a modified 
PEC1 [1] printing pipeline, CPU control system, peripherals and memory sub-system onto one SoC ASIC,
reducing component count and simplifying board design. 

This section will give a general introduction to Memjet printing systems, introduce the components that 
make a bi-lithic printhead system, describe possible system architectures and show how several SoPECs
can be used to achieve A3 and A4 duplex printing. The section "SoPEC ASIC" describes the SoC SoPEC 
ASIC, with subsections describing the CPU, DRAM and Print Engine Pipeline subsystems. Each section 
gives a detailed description of the blocks used and their operation within the overall print system. The final 
section describes the bi-lithic printhead construction and associated implications to the system due to its 
makeup. 

Some sections of this document were derived from the Print Engine Controller Hardware Design
Specification [1] written by Silverbrook Research.

4 Nomenclature



4.1 Bi-lithic Printhead Notation

A bi-lithic based printhead is constructed from 2 printhead ICs of varying sizes. The notation M:N is used
to express the size relationship of each IC, where M specifies one printhead IC in inches and N specifies
the remaining printhead IC in inches.

Section 35 Memjet Printhead contains a description of the bi-lithic printhead and related terminology.

4.2 Definitions

The following terms are used throughout this specification:

Bi-lithic printhead   Refers to printhead constructed from 2 printhead ICs.
CPU                   Refers to CPU core, caching system and MMU.
ISI-Bridge chip       A device with a high speed interface (such as USB2.0, Ethernet or IEEE1394) and
                      one or more ISI interfaces. The ISI-Bridge would be the ISIMaster for each of the
                      ISI buses it interfaces to.
ISIMaster             The ISIMaster is the only device allowed to initiate communication on the Inter
                      SoPEC Interface (ISI) bus. The ISIMaster interfaces directly with the host.
ISISlave              Multi-SoPEC systems will contain one or more ISISlave SoPECs connected to the
                      ISI bus. ISISlaves can only respond to communication initiated by the ISIMaster.
LEON                  Refers to the LEON CPU core.
LineSyncMaster        The LineSyncMaster device generates the line synchronisation pulse that all
                      SoPECs in the system must synchronise their line outputs to.
Multi-SoPEC           Refers to SoPEC based print system with multiple SoPEC devices.
Netpage               Refers to page printed with tags (normally in infrared ink).
PEC1                  Refers to Print Engine Controller version 1, precursor to SoPEC used to control
                      printheads constructed from multiple angled printhead segments.
Printhead IC          Single MEMS IC used to construct bi-lithic printhead.
PrintMaster           The PrintMaster device is responsible for coordinating all aspects of the print
                      operation. There may only be one PrintMaster in a system.
QA Chip               Quality Assurance Chip.
Storage SoPEC         An ISISlave SoPEC used as a DRAM store and which does not print.
Tag                   Refers to pattern which encodes information about its position and orientation which
                      allow it to be optically located and its data contents read.



4.3 Acronyms and Abbreviations

The following acronyms and abbreviations are used in this specification:

CFU     Contone FIFO Unit
CPU     Central Processing Unit
DIU     DRAM Interface Unit
DNC     Dead Nozzle Compensator
DRAM    Dynamic Random Access Memory
DWU     DotLine Writer Unit
GPIO    General Purpose Input Output
HCU     Halftoner Compositor Unit
ICU     Interrupt Controller Unit
ISI     Inter SoPEC Interface
LDB     Lossless Bi-level Encoder
LLU     Line Loader Unit
LSS     Low Speed Serial Interface
MEMS    Micro Electro Mechanical System
MMU     Memory Management Unit
PCU     SoPEC Controller Unit
PHI     Printhead Interface
PSS     Power Save Storage Unit
RDU     Real-time Debug Unit
ROM     Read Only Memory
SCB     Serial Communication Block
SFU     Spot FIFO Unit
SMG4    Silverbrook Modified Group 4
SoPEC   Small office home office Print Engine Controller
SRAM    Static Random Access Memory
TE      Tag Encoder
TFU     Tag FIFO Unit
TIM     Timers Unit
USB     Universal Serial Bus



4.4 Pseudocode notation 

In general the pseudocode examples use C like statements with some exceptions.
Symbol and naming conventions used for pseudocode:

//                  Comment
=                   Assignment
==,!=,<,>           Operator equal, not equal, less than, greater than
+,-,*,/,%           Operator addition, subtraction, multiply, divide, modulus
&,|,^,<<,>>,~       Bitwise AND, bitwise OR, bitwise exclusive OR, left shift, right shift, complement
AND,OR,NOT          Logical AND, Logical OR, Logical inversion
[XX:YY]             Array/vector specifier
{a, b, c}           Concatenation operation
++,--               Increment and decrement
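As an illustration of the [XX:YY] vector specifier and the {a, b, c} concatenation operator, the following Python sketch mimics both on plain integers. The helper names bit_slice and concat are assumptions made for this example and are not part of the notation. It also assumes that [XX:YY] is inclusive at both ends with XX as the most significant bit, which is the usual RTL reading but is not stated in this section.

```python
def bit_slice(value: int, msb: int, lsb: int) -> int:
    """Return value[msb:lsb], inclusive at both ends (assumed RTL-style)."""
    if msb < lsb or lsb < 0:
        raise ValueError("expected msb >= lsb >= 0")
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def concat(*fields):
    """Concatenate (value, width) fields; the first field is most significant."""
    result = 0
    for value, width in fields:
        if value < 0 or value >= (1 << width):
            raise ValueError("value does not fit in the stated width")
        result = (result << width) | value
    return result


word = 0xA5C3
high_byte = bit_slice(word, 15, 8)               # word[15:8]
low_byte = bit_slice(word, 7, 0)                 # word[7:0]
swapped = concat((low_byte, 8), (high_byte, 8))  # {word[7:0], word[15:8]}
```

Each field carries its width explicitly because a Python integer, unlike a hardware vector, has no fixed size.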

4.4.1 Register and signal naming conventions 

In general register naming uses the C style conventions with capitalization to denote word delimiters.
Signals use RTL style notation where underscores denote word delimiters. There is a direct translation between
the two conventions. For example the CmdSourceFifo register is equivalent to the cmd_source_fifo signal.
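The translation between the two naming styles can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes every word in a register name starts with a single capital letter, so an all-capitals acronym inside a name would need special handling.

```python
import re


def register_to_signal(name: str) -> str:
    """Convert a capitalized register name to an underscore-delimited signal name."""
    words = re.findall(r"[A-Za-z][a-z0-9]*", name)
    return "_".join(word.lower() for word in words)


def signal_to_register(name: str) -> str:
    """Convert an underscore-delimited signal name to a capitalized register name."""
    return "".join(part.capitalize() for part in name.split("_"))


signal = register_to_signal("CmdSourceFifo")
register = signal_to_register("cmd_source_fifo")
```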



4.5 State machine notation

State machines should be described using the pseudocode notation outlined above. State machine
descriptions use the convention of underline to indicate the cause of a transition from one state to another and
plain text (no underline) to indicate the effect of the transition, i.e. signal transitions which occur when the
new state is entered.

A sample state machine is shown in Figure 1. 



[Figure 1 diagram: state machine with Reset and Idle states; labelled signals include cdu_diu_rreq and ignore_data]

Figure 1. Example state machine notation



Doc: SoPEC_hardware_design
Version: 2.3
S3 Proprietary Document
29 Nov 2002

SoPEC : Hardware Design



5 Printing Considerations 

A bi-lithic printhead produces 1600 dpi bi-level dots. On low-diffusion paper, each ejected drop forms a
22.5 µm diameter dot. Dots are easily produced in isolation, allowing dispersed-dot dithering to be
exploited to its fullest. Since the bi-lithic printhead is the width of the page and operates with a constant
paper velocity, color planes are printed in perfect registration, allowing ideal dot-on-dot printing.
Dot-on-dot printing minimizes 'muddying' of midtones caused by inter-color bleed.

A page layout may contain a mixture of images, graphics and text. Continuous-tone (contone) images and
graphics are reproduced using a stochastic dispersed-dot dither. Unlike a clustered-dot (or
amplitude-modulated) dither, a dispersed-dot (or frequency-modulated) dither reproduces high spatial frequencies (i.e.
image detail) almost to the limits of the dot resolution, while simultaneously reproducing lower spatial
frequencies to their full color depth, when spatially integrated by the eye. A stochastic dither matrix is
carefully designed to be free of objectionable low-frequency patterns when tiled across the image. As such its
size typically exceeds the minimum size required to support a particular number of intensity levels (e.g.
16x16x8 bits for 257 intensity levels).
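The tiling of a dither matrix across an image can be sketched as follows. This is a minimal illustration under stated assumptions, not the SoPEC dither: it substitutes a small 4x4 Bayer ordered-dither matrix for a stochastic matrix, and it assumes the rule that a dot is printed when the contone value exceeds the threshold at that position.

```python
# 4x4 Bayer matrix, used here only as a stand-in for a stochastic dither matrix.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]
# Scale the 16 ranks to 8-bit thresholds (8, 24, ..., 248).
THRESHOLDS = [[rank * 16 + 8 for rank in row] for row in BAYER_4X4]


def dither(contone, thresholds=THRESHOLDS):
    """Convert rows of 0..255 contone values into rows of 0/1 dots.

    The threshold matrix is tiled across the image by wrapping its indices.
    """
    size_y = len(thresholds)
    size_x = len(thresholds[0])
    return [
        [1 if pixel > thresholds[y % size_y][x % size_x] else 0
         for x, pixel in enumerate(row)]
        for y, row in enumerate(contone)
    ]


flat_grey = [[128] * 8 for _ in range(8)]
dots = dither(flat_grey)
coverage = sum(map(sum, dots)) / 64  # fraction of dots that are printed
```

A mid-grey input turns on half of the dots, which is the spatial integration effect the paragraph above relies on.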

Human contrast sensitivity peaks at a spatial frequency of about 3 cycles per degree of visual field and
then falls off logarithmically, decreasing by a factor of 100 beyond about 40 cycles per degree and
becoming immeasurable beyond 60 cycles per degree [21][22]. At a normal viewing distance of 12 inches (about
300 mm), this translates roughly to 200-300 cycles per inch (cpi) on the printed page, or 400-600 samples
per inch according to Nyquist's theorem.
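The conversion from cycles per degree to cycles per inch is simple trigonometry and can be checked with a short sketch. The function name is an assumption for illustration. At a 12 inch viewing distance one degree of visual field spans about 0.21 inches, so 40 and 60 cycles per degree come to roughly 190 and 286 cpi, consistent with the rounded range quoted above.

```python
import math


def cycles_per_inch(cycles_per_degree: float, viewing_distance_in: float) -> float:
    """Convert angular spatial frequency to cycles per inch on the page."""
    inches_per_degree = viewing_distance_in * math.tan(math.radians(1.0))
    return cycles_per_degree / inches_per_degree


for cpd in (40, 60):
    cpi = cycles_per_inch(cpd, 12.0)
    # Nyquist: at least two samples are needed per cycle.
    print(f"{cpd} cycles/degree -> {cpi:.0f} cpi, {2 * cpi:.0f} samples per inch")
```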

In practice, contone resolution above about 300 ppi is of limited utility outside special applications such as 
medical imaging. Offset printing of magazines, for example, uses contone resolutions in the range 150 to 
300 ppi. Higher resolutions contribute slightly to color error through the dither.

Black text and graphics are reproduced directly using bi-level black dots, and are therefore not anti-aliased
(i.e. low-pass filtered) before being printed. Text should therefore be supersampled beyond the perceptual
limits discussed above, to produce smoother edges when spatially integrated by the eye. Text resolution up
to about 1200 dpi continues to contribute to perceived text sharpness (assuming low-diffusion paper, of
course).

A Netpage printer, for example, may use a contone resolution of 267 ppi (i.e. 1600 dpi / 6), and a black 
text and graphics resolution of 800 dpi. A high end office or departmental printer may use a contone reso- 
lution of 320 ppi (1600 dpi / 5) and a black text and graphics resolution of 1600 dpi. Both formats are 
capable of exceeding the quality of commercial (offset) printing and photographic reproduction.

6 Document Data Flow

6.1 Considerations

Because of the page-width nature of the bi-lithic printhead, each page must be printed at a constant speed 
to avoid creating visible artifacts. This means that the printing speed can't be varied to match the input 
data rate. Document rasterization and document printing are therefore decoupled to ensure the printhead
has a constant supply of data. A page is never printed until it is fully rasterized. This can be achieved by 
storing a compressed version of each rasterized page image in memory. 

This decoupling also allows the RIP(s) to run ahead of the printer when rasterizing simple pages, buying 
time to rasterize more complex pages. 

Because contone color images are reproduced by stochastic dithering, but black text and line graphics are
reproduced directly using dots, the compressed page image format contains a separate foreground bi-level
black layer and background contone color layer. The black layer is composited over the contone layer after
the contone layer is dithered (although the contone layer has an optional black component). A final layer
of Netpage tags (in infrared or black ink) is optionally added to the page for printout.

Figure 2 shows the flow of a document from computer system to printed page. 




Figure 2. Document data flow

At 267 ppi for example, an A4 page (8.26 inches x 11.7 inches) of contone CMYK data has a size of
26.3MB. At 320 ppi, an A4 page of contone data has a size of 37.8MB. Using lossy contone compression
algorithms such as JPEG [23], contone images compress with a ratio up to 10:1 without noticeable loss of
quality, giving compressed page sizes of 2.63MB at 267 ppi and 3.78 MB at 320 ppi.

At 800 dpi, an A4 page of bi-level data has a size of 7.4MB. At 1600 dpi, an A4 page of bi-level data has
a size of 29.5 MB. Coherent data such as text compresses very well. Using lossless bi-level compression
algorithms such as SMG4 fax as discussed in Section 8.1.2.3.1, ten-point plain text compresses with a
ratio of about 50:1. Lossless bi-level compression across an average page is about 20:1 with 10:1 possible
for pages which compress poorly. The requirement for SoPEC is to be able to print text at 10:1
compression. Assuming 10:1 compression gives compressed page sizes of 0.74 MB at 800 dpi, and 2.95 MB at
1600 dpi.
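The page sizes in this section follow directly from the page dimensions and resolution, and the arithmetic can be reproduced with a short sketch. The function names are assumptions for illustration. The sketch also assumes 8 bits per contone sample, four contone channels (CMYK) and 1 MB = 2^20 bytes, which is the convention under which the computed values round to the quoted ones.

```python
A4_WIDTH_IN, A4_HEIGHT_IN = 8.26, 11.7
MB = 1 << 20  # assumed: the quoted sizes use 2**20 bytes per MB


def contone_mb(ppi: float, channels: int = 4) -> float:
    """Uncompressed contone page size, one byte per sample per channel."""
    samples = A4_WIDTH_IN * A4_HEIGHT_IN * ppi * ppi
    return samples * channels / MB


def bilevel_mb(dpi: float) -> float:
    """Uncompressed bi-level page size, one bit per dot."""
    dots = A4_WIDTH_IN * A4_HEIGHT_IN * dpi * dpi
    return dots / 8 / MB


for label, size in (
    ("contone 267 ppi", contone_mb(267)),
    ("contone 320 ppi", contone_mb(320)),
    ("bi-level 800 dpi", bilevel_mb(800)),
    ("bi-level 1600 dpi", bilevel_mb(1600)),
):
    print(f"{label}: {size:.1f} MB raw, {size / 10:.2f} MB at 10:1")
```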

Once dithered, a page of CMYK contone image data consists of 116MB of bi-level data. Using lossless
bi-level compression algorithms on this data is pointless precisely because the optimal dither is stochastic,
i.e. it introduces hard-to-compress disorder.

Netpage tag data is optionally supplied with the page image. Rather than storing a compressed bi-level
data layer for the Netpage tags, the tag data is stored in its raw form. Each tag is supplied with up to 120 bits of
raw variable data (combined with up to 56 bits of raw fixed data) and covers up to a 6mm x 6mm area (at
1600 dpi). The absolute maximum number of tags on an A4 page is 15,540 when the tag is only 2mm x
2mm (each tag is 126 dots x 126 dots, for a total coverage of 148 tags x 105 tags). 15,540 tags of 128 bits
per tag gives a compressed tag page size of 0.24 MB.
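The tag count and tag data size above can be reproduced as follows. This is a sketch with assumed names; it takes the A4 page as 210mm x 297mm, counts only whole tags, and again treats 1 MB as 2^20 bytes.

```python
import math

DOTS_PER_MM = 1600 / 25.4
MB = 1 << 20  # assumed: 2**20 bytes per MB


def tag_grid(page_w_mm: float, page_h_mm: float, tag_mm: float):
    """Number of whole tags that fit across and down the page."""
    return math.floor(page_w_mm / tag_mm), math.floor(page_h_mm / tag_mm)


across, down = tag_grid(210, 297, 2)      # A4 page, 2mm x 2mm tags
tag_count = across * down
tag_dots = round(2 * DOTS_PER_MM)         # dots along one side of a 2mm tag
tag_page_mb = tag_count * 128 / 8 / MB    # 128 bits stored per tag
```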

The multi-layer compressed page image format therefore exploits the relative strengths of lossy JPEG con- 
tone image compression, lossless bi-level text compression, and tag encoding. The format is compact 
enough to be storage-efficient, and simple enough to allow straightforward real-time expansion during 
printing. 

Since text and images normally don't overlap, the normal worst-case page image size is image only, while
the normal best-case page image size is text only. The addition of worst case Netpage tags adds 0.24MB to
the page image size. The worst-case page image size is text over image plus tags. The average page size
assumes a quarter of an average page contains images. Table 1 shows data sizes for a compressed A4
page for these different options.



Table 1. Data sizes for A4 page (8.26 inches x 11.7 inches)

Page description                          267 ppi contone,     320 ppi contone,
                                          800 dpi bi-level     1600 dpi bi-level
Image only (contone), 10:1 compression    2.63 MB              3.78 MB
Text only (bi-level), 10:1 compression    0.74 MB              2.95 MB
Netpage tags, 1600 dpi                    0.24 MB              0.24 MB
Worst case (text + image + tags)          3.61 MB              6.67 MB
Average (text + 25% image + tags)         1.64 MB              4.25 MB



6.2 Document Data Flow 

The Host PC rasterizes and compresses the incoming document on a page by page basis. The page is
restructured into bands with one or more bands used to construct a page. The compressed data is then
transferred to the SoPEC device via the USB link. A complete band is stored in SoPEC embedded
memory. Once the band transfer is complete the SoPEC device reads the compressed data, expands the band,
normalizes contone, bi-level and tag data to 1600 dpi and transfers the resultant calculated dots to the
bi-lithic printhead.

The document data flow is:

• The RIP software rasterizes each page description and compresses the rasterized page image.

• The infrared layer of the printed page optionally contains encoded Netpage [5] tags at a programmable
density.

• The compressed page image is transferred to the SoPEC device via the USB, normally on a band by
band basis.

• The print engine takes the compressed page image and starts the page expansion.

• The first stage page expansion consists of 3 operations performed in parallel:

  • expansion of the JPEG-compressed contone layer

  • expansion of the SMG4 fax compressed bi-level layer

  • encoding and rendering of the bi-level tag data

• The second stage dithers the contone layer using a programmable dither matrix, producing up to four
bi-level layers at full-resolution.

• The second stage then composites the bi-level tag data layer, the bi-level SMG4 fax de-compressed
layer and up to four bi-level JPEG de-compressed layers into the full-resolution page image.

• A fixative layer is also generated as required.

• The last stage formats and prints the bi-level data through the bi-lithic printhead via the printhead
interface.

The SoPEC device can print a full resolution page with 6 color planes. Each of the color planes can be
generated from compressed data through any channel (either JPEG compressed, bi-level SMG4 fax
compressed, tag data generated, or fixative channel created) with a maximum number of 6 data channels from
page RIP to bi-lithic printhead color planes.

The mapping of data channels to color planes is programmable. This allows multiple color planes in the
printhead to map to the same data channel, providing redundancy in the printhead to assist dead nozzle
compensation.
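A programmable mapping of this kind can be pictured as a small lookup table from color plane to data channel. The sketch below is purely illustrative: the table layout, the function name and the choice of channel order are assumptions, and it does not describe the actual SoPEC registers.

```python
NUM_PLANES = 6


def map_channels_to_planes(channel_dots, plane_to_channel):
    """Select, for each color plane, the dot value of its assigned data channel."""
    if len(plane_to_channel) != NUM_PLANES:
        raise ValueError("one channel index is required per color plane")
    return [channel_dots[channel] for channel in plane_to_channel]


# Hypothetical setup: planes 3 and 4 are both fed from channel 3, so the
# printhead carries a redundant plane for that channel.
plane_to_channel = [0, 1, 2, 3, 3, 4]
channel_dots = [1, 0, 1, 1, 0, 0]  # one dot value per data channel
plane_dots = map_channels_to_planes(channel_dots, plane_to_channel)
```

Because the table is indexed by plane, two planes sharing a channel needs no special case: both entries simply hold the same channel index.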

Also a data channel could be used to gate data from another data channel. For example in stencil mode,
data from the bi-level data channel at 1600 dpi can be used to filter the contone data channel at 320 dpi,
giving the effect of a 1600 dpi contone image.

6.3 Page considerations due to SoPEC 

The SoPEC device typically stores a complete page of document data on chip. The amount of storage
available for compressed pages is limited to 2Mbytes, imposing a fixed maximum on compressed page
size. A comparison of the compressed image sizes in Table 1 indicates that SoPEC would not be capable
of printing worst case pages unless they are split into bands and printing commences before all the bands
for the page have been downloaded. The page sizes in the table are shown for comparison purposes and
would be considered reasonable for a professional level printing system. The SoPEC device is aimed at the
consumer level and would not be required to print pages of that complexity. Target document types for the
SoPEC device are shown in Table 2.



Table 2. Page content targets for SoPEC

Page content description                              Calculation                   Size (MByte)
Best case picture image, 267 ppi with 3 colors,       8.26x11.7x267x267x3 @ 10:1    1.97
A4 size
Full page text, 800 dpi, A4 size                      8.26x11.7x800x800 @ 10:1      0.74
Mixed graphics and text:                                                            1.55
 - Image of 6 inches x 4 inches @ 267 ppi, 3 colors   6x4x267x267x3 @ 5:1
 - Remaining area text ~73 inches^2, 800 dpi          800x800x73 @ 10:1
Best case photo, 3 colors, 6.6 Megapixel image        6.6 Mpixel @ 10:1             2.00



If a document with more complex pages is required, the page RIP software in the host PC can determine
that there is insufficient memory storage in SoPEC for that document. In such cases the RIP software
can take two courses of action. It can increase the compression ratio until the compressed page size will fit
in the SoPEC device, at the expense of document quality, or divide the page into bands and allow SoPEC
to begin printing a page band before all bands for that page are downloaded. Once SoPEC starts printing a
page it cannot stop. If SoPEC consumes compressed data faster than the bands can be downloaded, a buffer
underrun error could occur, causing the print to fail. A buffer underrun occurs if a line synchronisation pulse
is received before a line of data has been transferred to the printhead.
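The risk of printing before all bands have arrived can be explored with a simple timing model. The sketch below is an assumption-laden illustration rather than SoPEC behaviour: it works at band granularity instead of line granularity, assumes a constant download rate, assumes each band takes a known fixed time to print, and ignores the limit on page store capacity. All names are hypothetical.

```python
def first_underrun(band_bytes, band_print_seconds, download_bytes_per_s,
                   start_after_bands=1):
    """Return the index of the first band that is not fully downloaded when
    the printer needs it, or None if the page prints without underrun."""
    # Bands are downloaded back to back starting at t = 0.
    done_at = []
    elapsed = 0.0
    for size in band_bytes:
        elapsed += size / download_bytes_per_s
        done_at.append(elapsed)

    # Printing starts once the chosen number of bands has arrived.
    needed_at = done_at[start_after_bands - 1]
    for index, seconds in enumerate(band_print_seconds):
        if done_at[index] > needed_at:
            return index
        needed_at += seconds  # constant paper speed: the next band is due here
    return None


bands = [1_000_000, 1_000_000]
print_times = [1.0, 1.0]
ok = first_underrun(bands, print_times, download_bytes_per_s=1_000_000)
slow = first_underrun(bands, print_times, download_bytes_per_s=500_000)
buffered = first_underrun(bands, print_times, download_bytes_per_s=500_000,
                          start_after_bands=2)
```

In this model, delaying the start of printing until more bands are buffered removes the underrun, at the cost of a later start.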

Other options which can be considered if the page does not fit completely into the compressed page store 
are to slow the printing or to use multiple SoPECs to print parts of the page. A Storage SoPEC (Section 
7.2.5) could be added to the system to provide guaranteed bandwidth data delivery. The print system could 
also be constructed using an ISI-Bridge chip (Section 7.2.6) to provide guaranteed data delivery.

7 Memjet Printer Architecture

The SoPEC device can be used in several printer configurations and architectures.

In the general sense every SoPEC based printer architecture will contain: 

• One or more SoPEC devices. 

• One or more bi-lithic printheads. 

• Two or more LSS busses. 

• Two or more QA chips. 

• USB 1.1 connection to host or ISI connection to Bridge Chip.

• ISI bus connection between SoPECs (when multiple SoPECs are used).

Some example printer configurations are outlined in Section 7.2. The various system components are
outlined briefly in Section 7.1.

7.1 System Components 

7.1.1 SoPEC Print Engine Controller

The SoPEC device contains several system on a chip (SoC) components, as well as the print engine
pipeline control application specific logic.

7.1.1.1 Print Engine Pipeline (PEP) Logic

The PEP reads compressed page store data from the embedded memory, optionally decompresses the data
and formats it for sending to the printhead. The print engine pipeline functionality includes expanding the
page image, dithering the contone layer, compositing the black layer over the contone layer, rendering of
Netpage tags, compensation for dead nozzles in the printhead, and sending the resultant image to the
bi-lithic printhead.

7.1.1.2 Embedded CPU

SoPEC contains an embedded CPU for general purpose system configuration and management. The CPU 
performs page and band header processing, motor control and sensor monitoring (via the GPIO) and other 
system control functions. The CPU can perform buffer management or report buffer status to the host. The 
CPU can optionally run vendor application specific code for general print control such as paper ready 
monitoring and LED status update. 

7.1.1.3 Embedded Memory Buffer

A 2.5Mbyte embedded memory buffer is integrated onto the SoPEC device, of which approximately
2Mbytes are available for compressed page store data. A compressed page is divided into one or more
bands, with a number of bands stored in memory. As a band of the page is consumed by the PEP for
printing a new band can be downloaded. The new band may be for the current page or the next page.

Using banding it is possible to begin printing a page before the complete compressed page is downloaded, 
but care must be taken to ensure that data is always available for printing or a buffer underrun may occur. 

A Storage SoPEC acting as a memory buffer (Section 7.2.5) or an ISI-Bridge chip with attached DRAM
(Section 7.2.6) could be used to provide guaranteed data delivery.

7.1.1.4 Embedded USB 1.1 Device

The embedded USB 1.1 device accepts compressed page data and control commands from the host PC,
and facilitates the data transfer to either embedded memory or to another SoPEC device in multi-SoPEC
systems.

7.1.2 Bi-lithic Printhead



The printhead is constructed by abutting 2 printhead ICs together. The printhead ICs can vary in size from
2 inches to 8 inches, so to produce an A4 printhead several combinations are possible. For example two
printhead ICs of 7 inches and 3 inches could be used to create an A4 printhead (the notation is 7:3).
Similarly a 6 and 4 combination (6:4) or a 5:5 combination could be used. An A3 printhead can be constructed
from an 8:6 or a 7:7 printhead IC combination. For photographic printing smaller printheads can be constructed.
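The M:N combinations can be enumerated mechanically. The sketch below assumes whole-inch IC sizes and takes the total printhead length from the examples in the text (10 inches for A4, 14 inches for A3); both are assumptions made for illustration, as is the function name.

```python
def ic_pairs(total_inches: int, min_ic: int = 2, max_ic: int = 8):
    """List M:N printhead IC pairs (M >= N) whose lengths sum to total_inches."""
    pairs = []
    for m in range(max_ic, min_ic - 1, -1):
        n = total_inches - m
        if min_ic <= n <= m:
            pairs.append((m, n))
    return pairs


a4_pairs = ic_pairs(10)
a3_pairs = ic_pairs(14)
print(", ".join(f"{m}:{n}" for m, n in a4_pairs))
print(", ".join(f"{m}:{n}" for m, n in a3_pairs))
```

Under these assumptions the A4 list also includes an 8:2 pairing, which the text does not mention but which falls inside the stated 2 to 8 inch range.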



7.1.3 LSS interface bus

Each SoPEC device has 2 LSS system buses for communication with QA devices for system
authentication and ink usage accounting. The number of QA devices per bus and their position in the system is
unrestricted with the exception that PRINTER_QA and INK_QA devices should be on separate LSS busses.

7.1.4 QA devices

Each SoPEC system can have several QA devices. Normally each printing SoPEC will have an associated
PRINTER_QA. Ink cartridges will contain an INK_QA chip. PRINTER_QA and INK_QA devices should
be on separate LSS busses. All QA chips in the system are physically identical with flash memory contents
defining PRINTER_QA from INK_QA chip.

7.1.5 ISI interface 

The Inter-SoPEC Interface (ISI) provides a communication channel between SoPECs in a multi-SoPEC
system. The ISIMaster can be a SoPEC device or an ISI-Bridge chip depending on the printer configuration.
Both compressed data and control commands are transferred via the interface.

7.1.6 ISI-Bridge Chip 

The ISI-Bridge chip is a device, other than a SoPEC with a USB connection, which provides print data to a
number of slave SoPECs. A bridge chip will typically have a high bandwidth connection, such as USB2.0, Ethernet or
IEEE1394, to a host and may have an attached external DRAM for compressed page storage. A bridge
chip would have one or more ISI interfaces. The use of multiple ISI buses would allow the construction of
independent print systems within the one printer. The ISI-Bridge would be the ISIMaster for each of the
ISI buses it interfaces to.

7.2 Possible SoPEC Systems 

Several possible SoPEC based system architectures exist. The following sections outline some possible 
architectures. It is possible to have extra SoPEC devices in the system used for DRAM storage. The QA 
chip configurations shown are indicative of the flexibility of LSS bus architecture, but not limited to those 
configurations.

7.2.1 A4 Simplex with 1 SoPEC device

[Figure 3 diagram: USB from Host to SoPEC, printhead assembly; legend: high speed and low speed links]

Figure 3. Single SoPEC A4 Simplex system



In Figure 3, a single SoPEC device can be used to control two printhead ICs. The SoPEC receives com- 
pressed data through the USB device from the host. The compressed data is processed and transferred to 
the printhead. 



7.2.2 A4 Duplex with 2 SoPEC devices 



[Figure 4 diagram: USB from Host; legend: high speed and low speed links]

Figure 4. Dual SoPEC A4 Duplex system

In Figure 4, two SoPEC devices are used to control two bi-lithic printheads, each with two printhead ICs.
Each bi-lithic printhead prints to opposite sides of the same page to achieve duplex printing. The SoPEC
connected to the USB is the ISIMaster SoPEC, the remaining SoPEC is an ISISlave. The ISIMaster
receives all the compressed page data for both SoPECs and re-distributes the compressed data over the
Inter-SoPEC Interface (ISI) bus.

It may not be possible to print an A4 page every 2 seconds in this configuration since the USB 1.1
connection to the host may not have enough bandwidth. An alternative would be for each SoPEC to have its own
USB 1.1 connection. This would allow a faster average print speed.
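The bandwidth limit can be made concrete with a rough model. It assumes the host link delivers about 2 Mbytes every 2 seconds, the figure this section quotes for the USB 1.1 connection in the A3 configurations, and that giving each SoPEC its own connection scales the available bandwidth linearly. The names and the linear scaling are assumptions for illustration only.

```python
MBYTE = 1 << 20
USB_BYTES_PER_SECOND = 2 * MBYTE / 2.0  # assumed: 2 Mbytes every 2 seconds
TARGET_PAGE_SECONDS = 2.0


def page_seconds(compressed_mbytes: float, usb_links: int = 1) -> float:
    """Estimated time per page when limited by either the print engine
    target or the time needed to download the compressed page data."""
    transfer = compressed_mbytes * MBYTE / (USB_BYTES_PER_SECOND * usb_links)
    return max(TARGET_PAGE_SECONDS, transfer)


single_link = page_seconds(4.0)               # both sides over one connection
dual_link = page_seconds(4.0, usb_links=2)    # one connection per SoPEC
```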




In Figure 5, two SoPEC devices are used to control one A3 bi-lithic printhead. Each SoPEC controls only
one printhead IC (the remaining PHI port typically remains idle). The USB 1.1 connection defines the
ISIMaster SoPEC. In this dual SoPEC configuration the compressed page store data is split across 2 SoPECs
giving a total of 4Mbyte page store. This allows the system to use compression rates as in an A4
architecture, but with the increased page size of A3. The ISIMaster receives all the compressed page data for all
SoPECs and re-distributes the compressed data over the Inter-SoPEC Interface (ISI) bus.

It may not be possible to print an A3 page every 2 seconds in this configuration since the USB 1.1 connec- 
tion to the host will only have enough bandwidth to supply 2Mbytes every 2 seconds. Pages which require 
more than 2MBytes every 2 seconds will therefore print slower. An alternative would be for each SoPEC 
to have its own USB 1.1 connection. This would allow a faster average print speed.

7.2.4 A3 Duplex with 4 SoPEC devices

[Figure 6 diagram: labels include replaceable ink cartridges]

Figure 6. Quad SoPEC A3 duplex system



In Figure 6 a 4 SoPEC system is shown. It contains 2 A3 bi-lithic printheads, one for each side of an A3
page. Each printhead contains 2 printhead ICs. Each printhead IC is controlled by an independent SoPEC
device, with the remaining PHI port typically unused. Again the USB 1.1 connection defines the ISIMaster
with the other SoPECs as ISISlaves. In total, the system contains 8Mbytes of compressed page store
(2Mbytes per SoPEC), so the increased page size does not degrade the system print quality from that of an
A4 simplex printer. The ISIMaster receives all the compressed page data for all SoPECs and re-distributes
the compressed data over the Inter-SoPEC Interface (ISI) bus.

It may not be possible to print an A3 page every 2 seconds in this configuration since the USB 1.1 connec- 
tion to the host will only have enough bandwidth to supply 2Mbytes every 2 seconds. Pages which require 
more than 2MBytes every 2 seconds will therefore print slower. An alternative would be for each SoPEC 
or set of SoPECs on the same side of the page to have their own USB 1.1 connection. This would allow a
faster average print speed.

7.2.5 SoPEC DRAM storage solution: A4 Simplex with 1 printing SoPEC and 1 memory SoPEC

[Figure 7 diagram: USB from Host, SoPEC Device #1, SoPEC used for storage, printhead assembly; legend: high speed and low speed links]

Figure 7. SoPEC A4 Simplex system with extra SoPEC used as DRAM storage
Extra SoPECs can be used for DRAM storage e.g. in Figure 7 an A4 simplex printer can be built with a 
single extra SoPEC used for DRAM storage. The DRAM SoPEC can provide guaranteed bandwidth deliv- 
ery of data to the printing SoPEC. SoPEC configurations can have multiple extra SoPECs used for DRAM 
storage.

7.2.6 ISI-Bridge chip solution: A3 Duplex system with 4 SoPEC devices

[Figure 8 diagram: labels include replaceable ink cartridge, ink cartridge QA chip, A3 Memjet printhead and printhead assembly; legend: high speed and low speed links]

Figure 8. A3 duplex system featuring four printing SoPECs



In Figure 8, an ISI-Bridge chip provides slave-only ISI connections to SoPEC devices. Figure 8 shows an
ISI-Bridge chip with 2 separate ISI ports. The ISI-Bridge chip is the ISIMaster on each of the ISI busses it
is connected to. All connected SoPECs are ISISlaves. The ISI-Bridge chip will typically have a high
bandwidth connection to a host and may have an attached external DRAM for compressed page storage.

An alternative to having an ISI-Bridge chip would be for each SoPEC or each set of SoPECs on the same
side of a page to have their own USB 1.1 connection. This would allow a faster average print speed.

8 Page Format and Printflow



When rendering a page, the RIP produces a page header and a number of bands (a non-blank page requires
at least one band) for a page. The page header contains high level rendering parameters, and each band
contains compressed page data. The size of the band will depend on the memory available to the RIP, the
speed of the RIP, and the amount of memory remaining in SoPEC while printing the previous band(s).
Figure 9 shows the high level data structure of a number of pages with different numbers of bands in the page.

[Figure 9 diagram: a blank page (page header only), a single band page, a 2 band page and a multi band page (page header followed by band 0, band 1 ... band n)]

Figure 9. Pages containing different numbers of bands

Each compressed band contains a mandatory band header, an optional bi-level plane, optional sets of
interleaved contone planes, and an optional tag data plane (for Netpage enabled applications). Since each of
these planes is optional¹, the band header specifies which planes are included with the band. Figure 10
gives a high-level breakdown of the contents of a page band.
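One way to picture a header that records which optional planes are present is a set of presence flags. The sketch below is hypothetical: the flag names and bit values are invented for illustration and are not the SoPEC band header layout. It only captures the rule that each plane is optional while a band must contain at least one.

```python
from enum import IntFlag


class Planes(IntFlag):
    """Hypothetical presence flags for the optional planes of a band."""
    BILEVEL = 1
    CONTONE = 2
    TAG = 4


def validate_band_planes(planes: Planes) -> Planes:
    """Reject a band header that declares no planes at all."""
    if not planes:
        raise ValueError("a band must contain at least one plane")
    return planes


text_and_tags = validate_band_planes(Planes.BILEVEL | Planes.TAG)
has_contone = Planes.CONTONE in text_and_tags
```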



[Figure 10 diagram: band n consisting of a band header followed by the bi-level, contone and tag data planes]

Figure 10. Contents of a page band

A single SoPEC has maximum rendering restrictions as follows:

• 1 bi-level plane

• 1 contone interleaved plane set containing a maximum of 4 contone planes

• 1 tag data plane

• a bi-lithic printhead with a maximum of 2 printhead ICs

The requirement for single-sided A4 single SoPEC printing is:

• average contone JPEG compression ratio of 10:1, with a local minimum compression ratio of 5:1 for a
single line of interleaved JPEG blocks.

• average bi-level compression ratio of 10:1, with a local minimum compression ratio of 1:1 for a single
line.

1. Although a band must contain at least one plane.

If the page contains rendering parameters that exceed these specifications, then the RIP or the Host PC 
must split the page into a format that can be handled by a single SoPEC. 

In the general case, the SoPEC CPU must analyze the page and band headers and generate an appropriate 
set of register write commands to configure the units in SoPEC for that page. The various bands are passed 
to the destination SoPEC(s) to locations in DRAM determined by the host. 

The host keeps a memory map for the DRAM, and ensures that as a band is passed to a SoPEC, it is stored
in a suitable free area in DRAM. Each SoPEC is connected to the ISI bus or USB bus via its Serial
Communication Block (SCB). The SoPEC CPU configures the SCB to allow compressed data bands to pass
from the USB or ISI through the SCB to SoPEC DRAM. Figure 11 shows an example data flow for a page
destined to be printed by a single SoPEC. Band usage information is generated by the individual SoPECs
and passed back to the host.



[Figure 11 labels: the Host RIP sends the page/band header, bi-level plane, contone interleaved plane and tag data plane to the SCB. The SCB passes them through to SoPEC's DRAM, and the CPU issues register commands to SoPEC's registers.]



Figure 11. Page data path from host to SoPEC 

SoPEC has an addressing mechanism that permits circular band memory allocation, thus facilitating easy
memory management. However, it is not strictly necessary that all bands be stored together. As long as the
appropriate registers in SoPEC are set up for each band, and a given band is contiguous¹, the memory can
be allocated in any way.

1. Contiguous allocation also includes wrapping around in SoPEC's band store memory.
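
The wrap-around form of contiguous allocation can be illustrated with a small host-side sketch. This is not SoPEC's register interface: the function name `band_segments` and its parameters (`store_start`, `store_size`, `offset`, `length`, all in bytes) are hypothetical, and the sketch only shows how a band that runs past the end of a circular band store continues from the start of the store.

```python
def band_segments(store_start, store_size, offset, length):
    """Return the (address, length) pieces occupied by one band.

    Hypothetical helper: the band store is modelled as `store_size` bytes
    beginning at `store_start`, and the band starts `offset` bytes into it.
    A band that reaches the end of the store wraps around to the beginning.
    """
    if not 0 <= offset < store_size:
        raise ValueError("offset must lie inside the band store")
    if not 0 < length <= store_size:
        raise ValueError("band must fit inside the band store")
    first = min(length, store_size - offset)
    pieces = [(store_start + offset, first)]
    if first < length:
        # The remainder of the band continues at the start of the store.
        pieces.append((store_start, length - first))
    return pieces
```

A band that fits before the end of the store yields one piece, and a band that crosses the end yields two pieces that together are still treated as one contiguous band.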
8.1 Print engine example page format

This section describes a possible format of compressed pages expected by the embedded CPU in SoPEC.
The format is generated by software in the host PC and interpreted by embedded software in SoPEC. This
section indicates the type of information in a page format structure, but implementations need not be limited
to this format. The host PC can optionally perform the majority of the header processing.

The compressed format and the print engines are designed to allow real-time page expansion during print- 
ing, to ensure that printing is never interrupted in the middle of a page due to data underrun. 

The page format described here is for a single black bi-level layer, a contone layer, and a Netpage tag
layer. The black bi-level layer is defined to composite over the contone layer.

The black bi-level layer consists of a bitmap containing a 1-bit opacity for each pixel. This black layer
matte has a resolution which is an integer or non-integer factor of the printer's dot resolution. The highest
supported resolution is 1600 dpi, i.e. the printer's full dot resolution.

The contone layer, optionally passed in as YCrCb, consists of a 24-bit CMY or 32-bit CMYK color for
each pixel. This contone image has a resolution which is an integer or non-integer factor of the printer's
dot resolution. The requirement for a single SoPEC is to support 1 side per 2 seconds A4/Letter printing at
a resolution of 267 ppi, i.e. one-sixth the printer's dot resolution.

Non-integer scaling can be performed on both the contone and bi-level images. Only integer scaling can be 
performed on the tag data. 

The black bi-level layer and the contone layer are both in compressed form for efficient storage in the 
printer's internal memory. 



8.1.1 Page structure



A single SoPEC is able to print with full edge bleed for Letter and A3 via different stitch part combinations
of the bi-lithic printhead. It imposes no margins and so has a printable page area which corresponds
to the size of its paper. The target page size is constrained by the printable page area, less the explicit
(target) left and top margins specified in the page description. These relationships are illustrated below.



[Figure 12 labels: target top margin, target bottom margin, target page, printable page area (physical page).]



Figure 12. Page structure 



Doc: SoPEC_hardware_design
Version: 2.3 



S3 Proprietary Document 



29 Nov 2002 
Page 27 



SoPEC : Hardware Design 






8.1.2 Compressed page format 

Apart from being implicitly defined in relation to the printable page area, each page description is complete
and self-contained. There is no data stored separately from the page description to which the page
description refers. The page description consists of a page header which describes the size and resolution
of the page, followed by one or more page bands which describe the actual page content.

8.1.2.1 Page header

Table 3 shows an example format of a page header. 

Table 3. Page header format

Field | Format | Description
signature | 16-bit integer | Page header format signature.
version | 16-bit integer | Page header format version number.
structure size | 16-bit integer | Size of page header.
band count | 16-bit integer | Number of bands specified for this page.
target resolution (dpi) | 16-bit integer | Resolution of target page. This is always 1600 for the Memjet printer.
target page width | 16-bit integer | Width of target page, in dots.
target page height | 32-bit integer | Height of target page, in dots.
target left margin for black and contone | 16-bit integer | Width of target left margin, in dots, for black and contone.
target top margin for black and contone | 16-bit integer | Height of target top margin, in dots, for black and contone.
target right margin for black and contone | 16-bit integer | Width of target right margin, in dots, for black and contone.
target bottom margin for black and contone | 16-bit integer | Height of target bottom margin, in dots, for black and contone.
target left margin for tags | 16-bit integer | Width of target left margin, in dots, for tags.
target top margin for tags | 16-bit integer | Height of target top margin, in dots, for tags.
target right margin for tags | 16-bit integer | Width of target right margin, in dots, for tags.
target bottom margin for tags | 16-bit integer | Height of target bottom margin, in dots, for tags.
generate tags | 16-bit integer | Specifies whether to generate tags for this page (0 - no, 1 - yes).
fixed tag data | 128-bit integer | This is only valid if generate tags is set.
tag vertical scale factor | 16-bit integer | Scale factor in vertical direction from tag data resolution to target resolution. Valid range = 1-511. Integer scaling only.
tag horizontal scale factor | 16-bit integer | Scale factor in horizontal direction from tag data resolution to target resolution. Valid range = 1-511. Integer scaling only.
bi-level layer vertical scale factor | 16-bit integer | Scale factor in vertical direction from bi-level resolution to target resolution (must be 1 or greater). May be non-integer. Expressed as a fraction with the upper 8 bits the numerator and the lower 8 bits the denominator.



1. The page format assumes dither matrices and tag structures to have already been set up, but these are not considered to be part of a general page
format. It is trivial to extend the page format to allow exact specification of dither matrices and tag structures.
Table 3. Page header format (continued)

Field | Format | Description
bi-level layer horizontal scale factor | 16-bit integer | Scale factor in horizontal direction from bi-level resolution to target resolution (must be 1 or greater). May be non-integer. Expressed as a fraction with the upper 8 bits the numerator and the lower 8 bits the denominator.
bi-level layer page width | 16-bit integer | Width of bi-level layer page, in pixels.
bi-level layer page height | 32-bit integer | Height of bi-level layer page, in pixels.
contone flags | 16-bit integer | Defines the color conversion that is required for the JPEG data. Bits 2-0 specify how many contone planes there are (e.g. 3 for CMY and 4 for CMYK). Bit 3 specifies whether the first 3 color planes need to be converted back from YCrCb to CMY (0 - no conversion, leave JPEG colors alone; 1 - color convert); only valid if bits 2-0 = 3 or 4. Bits 7-4 specify whether the YCrCb was generated directly from CMY, or whether it was converted to RGB first via the step: R = 255-C, G = 255-M, B = 255-Y. Each of the color planes can be individually inverted: bit 4 for color plane 0, bit 5 for color plane 1, bit 6 for color plane 2, bit 7 for color plane 3 (0 - do not invert, 1 - invert). Bit 8 specifies whether the contone data is JPEG compressed or non-compressed (0 - JPEG compressed, 1 - non-compressed). The remaining bits are reserved (0).
contone vertical scale factor | 16-bit integer | Scale factor in vertical direction from contone channel resolution to target resolution. Valid range = 1-255. May be non-integer. Expressed as a fraction with the upper 8 bits the numerator and the lower 8 bits the denominator.
contone horizontal scale factor | 16-bit integer | Scale factor in horizontal direction from contone channel resolution to target resolution. Valid range = 1-255. May be non-integer. Expressed as a fraction with the upper 8 bits the numerator and the lower 8 bits the denominator.
contone page width | 16-bit integer | Width of contone page, in contone pixels.
contone page height | 32-bit integer | Height of contone page, in contone pixels.
reserved | up to 128 bytes | Reserved and 0 pads out page header to multiple of 128 bytes.



The page header contains a signature and version which allow the CPU to identify the page header format.
If the signature and/or version are missing or incompatible with the CPU, then the CPU can reject the
page.

The contone flags define how many contone layers are present, which typically is used for defining
whether the contone layer is CMY or CMYK. Additionally, if the color planes are CMY, they can be
optionally stored as YCrCb, and further optionally color space converted from CMY directly or via RGB.
Finally the contone data is specified as being either JPEG compressed or non-compressed.
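
The bit layout of the contone flags field can be made concrete with a short decoding sketch. It follows the bit assignments listed in Table 3; the names `ContoneFlags` and `decode_contone_flags` are illustrative only and are not part of the page format or of any SoPEC software.

```python
from typing import NamedTuple, Tuple


class ContoneFlags(NamedTuple):
    plane_count: int                             # bits 2-0
    convert_from_ycrcb: bool                     # bit 3
    invert_plane: Tuple[bool, bool, bool, bool]  # bits 4-7, planes 0-3
    non_compressed: bool                         # bit 8


def decode_contone_flags(flags: int) -> ContoneFlags:
    """Split the 16-bit contone flags value into its fields."""
    if not 0 <= flags <= 0xFFFF:
        raise ValueError("contone flags is a 16-bit field")
    if flags >> 9:
        raise ValueError("reserved bits must be 0")
    plane_count = flags & 0x7
    convert = bool((flags >> 3) & 1)
    if convert and plane_count not in (3, 4):
        raise ValueError("color conversion is only valid for 3 or 4 planes")
    invert = tuple(bool((flags >> (4 + plane)) & 1) for plane in range(4))
    return ContoneFlags(plane_count, convert, invert, bool((flags >> 8) & 1))
```

For example, a value of 0x00C describes 4 planes (CMYK) whose first 3 planes are converted back from YCrCb, with no plane inversion and JPEG compressed data.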

The page header defines the resolution and size of the target page. The bi-level and contone layers are
clipped to the target page if necessary. This happens whenever the bi-level or contone scale factors are not
factors of the target page width or height.

The target left, top, right and bottom margins define the positioning of the target page within the printable 
page area. 

The tag parameters specify whether or not Netpage tags should be produced for this page and what orientation
the tags should be produced at (landscape or portrait mode). The fixed tag data is also provided.

The contone, bi-level and tag layer parameters define the page size and the scale factors. 
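
The fractional scale factors in Table 3 pack a numerator into the upper 8 bits and a denominator into the lower 8 bits of a 16-bit field. The sketch below shows one way to unpack and pack such a value; the helper names are hypothetical, and rejecting a zero denominator is an assumption of this sketch rather than a rule stated in the format.

```python
from fractions import Fraction


def decode_scale_factor(raw: int) -> Fraction:
    """Unpack a 16-bit scale factor: upper 8 bits numerator, lower 8 bits denominator."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("scale factor is a 16-bit field")
    numerator = (raw >> 8) & 0xFF
    denominator = raw & 0xFF
    if denominator == 0:
        raise ValueError("denominator must be non-zero")  # assumption of this sketch
    return Fraction(numerator, denominator)


def encode_scale_factor(numerator: int, denominator: int) -> int:
    """Pack a numerator and denominator (each 1-255) into the 16-bit field."""
    if not (1 <= numerator <= 255 and 1 <= denominator <= 255):
        raise ValueError("numerator and denominator must each fit in 8 bits")
    return (numerator << 8) | denominator
```

A contone layer at one-sixth of the dot resolution could be described as 6/1 (0x0601), while a non-integer factor such as 25/3 would be stored as 0x1903.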

8.1.2.2 Band format

Table 4 shows the format of the page band header. 

Table 4. Band header format

Field | Format | Description
signature | 16-bit integer | Page band header format signature.
version | 16-bit integer | Page band header format version number.
structure size | 16-bit integer | Size of page band header.
bi-level layer band height | 16-bit integer | Height of bi-level layer band, in black pixels.
bi-level layer band data size | 32-bit integer | Size of bi-level layer band data, in bytes.
contone band height | 16-bit integer | Height of contone band, in contone pixels.
contone band data size | 32-bit integer | Size of contone plane band data, in bytes.
tag band height | 16-bit integer | Height of tag band, in dots.
tag band data size | 32-bit integer | Size of unencoded tag data band, in bytes. Can be 0 which indicates that no tag data is provided.
reserved | up to 128 bytes | Reserved and 0 pads out band header to multiple of 128 bytes.



The bi-level layer parameters define the height of the black band, and the size of its compressed band data. 
The variable-size black data follows the page band header. 

The contone layer parameters define the height of the contone band, and the size of its compressed page 
data. The variable-size contone data follows the black data. 

The tag band data is the set of variable tag data half-lines as required by the tag encoder. The format of the 
tag data is found in Section 26.5.2. The tag band data follows the contone data. 

Table 5 shows the format of the variable-size compressed band data which follows the page band header. 



Table 5. Page band data format

Field | Format | Description
black data | Modified G4 facsimile bitstream¹ | Compressed bi-level layer.
contone data | JPEG bytestream | Compressed contone data layer.
tag data map | Tag data array | Tag data format. See Section 26.5.2.






1. See Section 8.1.2.3 on page 31 for a note regarding the use of this standard.

The start of each variable-size segment of band data should be aligned to a 256-bit DRAM word boundary. 
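
The alignment rule can be sketched as a layout calculation. A 256-bit DRAM word is 32 bytes, so each segment start is rounded up to a multiple of 32. The function names and the returned dictionary are illustrative assumptions; the segment order (black, contone, tag) follows the description of the band data above.

```python
DRAM_WORD_BYTES = 256 // 8  # a 256-bit DRAM word is 32 bytes


def align_up(address: int, alignment: int = DRAM_WORD_BYTES) -> int:
    """Round an address up to the next multiple of `alignment`."""
    return (address + alignment - 1) // alignment * alignment


def band_data_layout(data_start: int, black_size: int, contone_size: int, tag_size: int):
    """Return ({segment name: (start address, size)}, end address) for one band.

    `data_start` is the address just after the band header. Sizes are the
    byte counts taken from the band header; a size of 0 means the segment
    is absent but still receives a (zero-length) slot in this sketch.
    """
    layout = {}
    address = align_up(data_start)
    for name, size in (("black", black_size), ("contone", contone_size), ("tag", tag_size)):
        layout[name] = (address, size)
        address = align_up(address + size)
    return layout, address
```

Because the band header is padded to a multiple of 128 bytes, a band stored at an aligned address places its first data segment on a 256-bit boundary without further padding.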

The following sections describe the format of the compressed bi-level layers and the compressed contone
layer. Section 26.5.1 on page 365 describes the format of the tag data structures.



8.1.2.3 Bi-level data compression



The (typically 1600 dpi) black bi-level layer is losslessly compressed using Silverbrook Modified Group 4
(SMG4) compression which is a version of Group 4 Facsimile compression [18] without Huffman and
with simplified run length encodings. Typically compression ratios exceed 10:1. The encodings are listed in
Table 6 and Table 7.

Table 6. Bi-level group 4 facsimile style compression encodings

Encoding | Description
1000 | Pass Command: a0 ← b2, skip next two edges
1 | Vertical(0): a0 ← b1, color = !color
110 | Vertical(1): a0 ← b1 + 1, color = !color
010 | Vertical(-1): a0 ← b1 - 1, color = !color
110000 | Vertical(2): a0 ← b1 + 2, color = !color
010000 | Vertical(-2): a0 ← b1 - 2, color = !color
100000 | Vertical(3): a0 ← b1 + 3, color = !color
000000 | Vertical(-3): a0 ← b1 - 3, color = !color
<RL><RL>100 | Horizontal: a0 ← a0 + <RL> + <RL>







SMG4 has a pass through mode to cope with local negative compression. Pass through mode is activated
by a special run-length code. Pass through mode continues to either end of line or for a pre-programmed
number of bits, whichever is shorter. The special run-length code is always executed as a run-length code,
followed by pass through. The pass through escape code is a medium length run-length with a run of less
than or equal to 31.

Table 7. Run length (RL) encodings

Encoding | Description
RRRRR1 | Short Black Runlength (5 bits)
RRRRR1 | Short White Runlength (5 bits)
RRRRRRRRRR10 | Medium Black Runlength (10 bits)
RRRRRRRR10 | Medium White Runlength (8 bits)
RRRRRRRRRR10 | Medium Black Runlength with RRRRRRRRRR <= 31, Enter pass through
RRRRRRRR10 | Medium White Runlength with RRRRRRRR <= 31, Enter pass through
RRRRRRRRRRRRRRR00 | Long Black Runlength (15 bits)
RRRRRRRRRRRRRRR00 | Long White Runlength (15 bits)

Since the compression is a bitstream, the encodings are read right (least significant bit) to left (most significant
bit). The run lengths given as RRRR in Table 7 are read in the same way (least significant bit at the
right to most significant bit at the left).
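
The run length encodings of Table 7 can be illustrated with a small decoder for a single <RL> code. This is a sketch of one reading of the table, not the LBD hardware: it assumes the rightmost bit of each encoding is the first bit taken from the stream, that the run length bits follow least significant bit first, and that a medium code with a value of 31 or less signals entry to pass through mode. The names `decode_run_length` and `lsb_first` are hypothetical.

```python
def lsb_first(value, width):
    """Return `value` as a list of `width` bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def decode_run_length(bits, black):
    """Decode one run length code from an iterator of bits in stream order.

    Returns (run_length, enter_pass_through).
    """
    def take(count):
        value = 0
        for position in range(count):
            value |= next(bits) << position  # first bit read is the LSB
        return value

    if next(bits) == 1:                       # ...1  : short run, 5 bits
        return take(5), False
    if next(bits) == 1:                       # ...10 : medium run
        run = take(10 if black else 8)
        return run, run <= 31                 # small medium run = pass through escape
    return take(15), False                    # ...00 : long run, 15 bits
```

For instance, a medium white code carrying the value 20 decodes to a run of 20 followed by pass through, whereas a short code carrying the same value is an ordinary run.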

Each band of bi-level data is optionally self contained. The first line of each band therefore is based on a
'previous' blank line or the last line of the previous band.

8.1.2.3.1 Group 3 and 4 facsimile compression

The Group 3 Facsimile compression algorithm [18] losslessly compresses bi-level data for transmission
over slow and noisy telephone lines. The bi-level data represents scanned black text and graphics on a
white background, and the algorithm is tuned for this class of images (it is explicitly not tuned, for example,
for halftoned bi-level images). The 1D Group 3 algorithm runlength-encodes each scanline and then
Huffman-encodes the resulting runlengths. Runlengths in the range 0 to 63 are coded with terminating
codes. Runlengths in the range 64 to 2623 are coded with make-up codes, each representing a multiple of
64, followed by a terminating code. Runlengths exceeding 2623 are coded with multiple make-up codes
followed by a terminating code. The Huffman tables are fixed, but are separately tuned for black and white
runs (except for make-up codes above 1728, which are common). When possible, the 2D Group 3 algorithm
encodes a scanline as a set of short edge deltas (0, ±1, ±2, ±3) with reference to the previous scanline.
The delta symbols are entropy-encoded (so that the zero delta symbol is only one bit long etc.). Edges
within a 2D-encoded line which can't be delta-encoded are runlength-encoded, and are identified by a prefix.
1D- and 2D-encoded lines are marked differently. 1D-encoded lines are generated at regular intervals,
whether actually required or not, to ensure that the decoder can recover from line noise with minimal
image degradation. 2D Group 3 achieves compression ratios of up to 6:1 [28].

The Group 4 Facsimile algorithm [18] losslessly compresses bi-level data for transmission over error-free
communications lines (i.e. the lines are truly error-free, or error-correction is done at a lower protocol
level). The Group 4 algorithm is based on the 2D Group 3 algorithm, with the essential modification that
since transmission is assumed to be error-free, 1D-encoded lines are no longer generated at regular intervals
as an aid to error-recovery. Group 4 achieves compression ratios ranging from 20:1 to 60:1 for the
CCITT set of test images [28].

The design goals and performance of the Group 4 compression algorithm qualify it as a compression algorithm
for the bi-level layers. However, its Huffman tables are tuned to a lower scanning resolution (100-400 dpi),
and it encodes runlengths exceeding 2623 awkwardly.

8.1.2.4 Contone data compression

The contone layer (CMYK) is either a non-compressed bytestream or is compressed to an interleaved 
JPEG bytestream. The JPEG bytestream is complete and self-contained. It contains all data required for 
decompression, including quantization and Huffman tables. 

The contone data is optionally converted to YCrCb before being compressed (there is no specific advantage
in color-space converting if not compressing). Additionally, the CMY contone pixels are optionally
converted (on an individual basis) to RGB before color conversion using R=255-C, G=255-M, B=255-Y.
Optional bitwise inversion of the K plane may also be performed. Note that this CMY to RGB conversion
is not intended to be accurate for display purposes, but rather for the purposes of later converting to
YCrCb. The inverse transform will be applied before printing.

8.1.2.4.1 JPEG compression 

The JPEG compression algorithm [23] lossily compresses a contone image at a specified quality level. It
introduces imperceptible image degradation at compression ratios below 5:1, and negligible image degradation
at compression ratios below 10:1 [29].

JPEG typically first transforms the image into a color space which separates luminance and chrominance
into separate color channels. This allows the chrominance channels to be subsampled without appreciable
loss because of the human visual system's relatively greater sensitivity to luminance than chrominance.
After this first step, each color channel is compressed separately.

The image is divided into 8x8 pixel blocks. Each block is then transformed into the frequency domain via
a discrete cosine transform (DCT). This transformation has the effect of concentrating image energy in relatively
lower-frequency coefficients, which allows higher-frequency coefficients to be more crudely quantized.
This quantization is the principal source of compression in JPEG. Further compression is achieved
by ordering coefficients by frequency to maximize the likelihood of adjacent zero coefficients, and then
runlength-encoding runs of zeroes. Finally, the runlengths and non-zero frequency coefficients are entropy
coded. Decompression is the inverse process of compression.

8.1.2.4.2 Non-compressed format

If the contone data is non-compressed, it must be in a block-based format bytestream with the same pixel
order as would be produced by a JPEG decoder. The bytestream therefore consists of a series of 8x8 blocks
of the original image, starting with the top left 8x8 block, and working horizontally across the page (as it
will be printed) until the top rightmost 8x8 block, then the next row of 8x8 blocks (left to right) and so on
until the lower row of 8x8 blocks (left to right). Each 8x8 block consists of 64 8-bit pixels for color plane
0 (representing 8 rows of 8 pixels in the order top left to bottom right) followed by 64 8-bit pixels for color
plane 1 and so on for up to a maximum of 4 color planes.

If the original image is not a multiple of 8 pixels in X or Y, padding must be present (the extra pixel data
will be ignored by the setting of margins).
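
The block ordering of the non-compressed format can be sketched as follows. The function name `to_block_bytestream`, the representation of each color plane as a list of rows, and the padding value of 0 are assumptions made for illustration; the text only requires that padding be present.

```python
def to_block_bytestream(planes, width, height, pad=0):
    """Serialise planar 8-bit image data into 8x8 block order.

    `planes` is a list of up to 4 color planes, each a list of `height`
    rows of `width` pixel values. Blocks run left to right, then top to
    bottom; within a block, plane 0's 64 pixels come first, then plane 1's,
    and so on. Pixels outside the image are filled with `pad`.
    """
    if not 1 <= len(planes) <= 4:
        raise ValueError("between 1 and 4 color planes are supported")
    blocks_across = (width + 7) // 8
    blocks_down = (height + 7) // 8
    out = bytearray()
    for block_y in range(blocks_down):
        for block_x in range(blocks_across):
            for plane in planes:
                for y in range(block_y * 8, block_y * 8 + 8):
                    for x in range(block_x * 8, block_x * 8 + 8):
                        inside = x < width and y < height
                        out.append(plane[y][x] if inside else pad)
    return bytes(out)
```

An image 9 pixels wide and 1 pixel high with a single plane therefore occupies two blocks (128 bytes), with the ninth pixel appearing as the first byte of the second block.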

8.1.2.4.3 Compressed format 

If the contone data is compressed, the first memory band contains JPEG headers (including tables) plus
MCUs (minimum coded units). The ratio of space between the various color planes in the JPEG stream is
1:1:1:1. No subsampling is permitted. Banding can be completely arbitrary, i.e. there can be multiple JPEG
images per band or 1 JPEG image divided over multiple bands. The break between bands is only memory
alignment based.

8.1.2.4.4 Conversion of RGB to YCrCb (in RIP) 

YCrCb is defined as per CCIR 601-1 [20] except that Y, Cr and Cb are normalized to occupy all 256 levels
of an 8-bit binary encoding and take account of the actual hardware implementation of the inverse transform
within SoPEC.

The exact color conversion computation is as follows: 

• Y* = (9805/32768)R + (19235/32768)G + (3728/32768)B

• Cr* = (16375/32768)R - (13716/32768)G - (2659/32768)B + 128

• Cb* = -(5529/32768)R - (10846/32768)G + (16375/32768)B + 128

Y, Cr and Cb are obtained by rounding to the nearest integer. There is no need for saturation since ranges
of Y*, Cr* and Cb* after rounding are [0-255], [1-255] and [1-255] respectively. Note that full accuracy is
possible with 24 bits. See [14] for more information.
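
The conversion above can be written with integer arithmetic, as in the following sketch. The coefficients are those given in the equations; the function names and the particular way of rounding (adding half of 32768 before shifting) are choices made for this illustration and are not taken from the RIP or from SoPEC.

```python
def cmy_to_rgb(c, m, y):
    """Optional pre-step described in the text: R = 255-C, G = 255-M, B = 255-Y."""
    return 255 - c, 255 - m, 255 - y


def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb), rounding to the nearest integer."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("components must be 8-bit values")
    half = 1 << 14        # 0.5 expressed in units of 1/32768
    offset = 128 << 15    # the +128 term expressed in units of 1/32768
    y = (9805 * r + 19235 * g + 3728 * b + half) >> 15
    cr = (16375 * r - 13716 * g - 2659 * b + offset + half) >> 15
    cb = (-5529 * r - 10846 * g + 16375 * b + offset + half) >> 15
    return y, cr, cb
```

The coefficients of Y* sum to 32768 and those of Cr* and Cb* sum to zero, so neutral grays map to Cr = Cb = 128 and no clamping step is needed.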
9 Overview



The Small Office Home Office Print Engine Controller (SoPEC) is a page rendering engine ASIC that
takes compressed page images as input, and produces decompressed page images at up to 6 channels of
bi-level dot data as output. The bi-level dot data is generated for the Memjet bi-lithic printhead. The dot
generation process takes account of printhead construction, dead nozzles, and allows for fixative generation.
A single SoPEC can control 2 bi-lithic printheads and up to 6 color channels at 10,000 lines/sec¹, equating
to 30 pages per minute. A single SoPEC can perform full-bleed printing of A3, A4 and Letter pages. The 6
channels of colored ink are the expected maximum in a consumer SOHO, or office Bi-lithic printing
environment:

• CMY, for regular color printing.

• K, for black text, line graphics and gray-scale printing. 

• IR (infrared), for Netpage-enabled [5] applications. 

• F (fixative), to enable printing at high speed. Because the bi-lithic printer is capable of printing so fast,
a fixative may be required to enable the ink to dry before the page touches the page already printed.
Otherwise the pages may bleed on each other. In low speed printing environments the fixative may not
be required.

SoPEC is color space agnostic. Although it can accept contone data as CMYX or RGBX, where X is an
optional 4th channel, it also can accept contone data in any print color space. Additionally, SoPEC provides
a mechanism for arbitrary mapping of input channels to output channels, including combining dots
for ink optimization, generation of channels based on any number of other channels etc. However, inputs
are typically CMYK for contone input, K for the bi-level input, and the optional Netpage tag dots are typically
rendered to an infra-red layer. A fixative channel is typically generated for fast printing applications.

SoPEC is resolution agnostic. It merely provides a mapping between input resolutions and output resolutions
by means of scale factors. The expected output resolution is 1600 dpi, but SoPEC actually has no
knowledge of the physical resolution of the Bi-lithic printhead.

SoPEC is page-length agnostic. Successive pages are typically split into bands and downloaded into the
page store as each band of information is consumed and becomes free.

SoPEC provides an interface for synchronization with other SoPECs. This allows simple multi-SoPEC
solutions for simultaneous A3/A4/Letter duplex printing. However, SoPEC is also capable of printing only
a portion of a page image. Combining synchronization functionality with partial page rendering allows
multiple SoPECs to be readily combined for alternative printing requirements including simultaneous
duplex printing and wide format printing.

Table 8 lists some of the features and corresponding benefits of SoPEC. 

Table 8. Features and Benefits of SoPEC

Feature | Benefits
Optimised print architecture in hardware | 30ppm full page photographic quality color printing from a desktop PC
0.13 micron CMOS (>3 million transistors) | High speed; Low cost; High functionality



1. 10,000 lines per second equates to 30 A4/Letter pages per minute at 1600 dpi.






Table 8. Features and Benefits of SoPEC (continued)

Feature | Benefits
900 Million dots per second | Extremely fast page generation
10,000 lines per second at 1600 dpi | 0.5 A4/Letter pages per SoPEC chip per second
1 chip drives up to 133,920 nozzles | Low cost page-width printers
1 chip drives up to 6 color planes | 99% of SoHo printers can use 1 SoPEC device
Integrated DRAM | No external memory required, leading to low cost systems
Power saving sleep mode | SoPEC can enter a power saving sleep mode to reduce power dissipation between print jobs
JPEG expansion | Low bandwidth from PC; Low memory requirements in printer
Lossless bitplane expansion | High resolution text and line art with low bandwidth from PC (e.g. over USB)
Netpage tag expansion | Generates interactive paper
Stochastic dispersed dot dither | Optically smooth image quality; No moire effects
Hardware compositor for 6 image planes | Pages composited in real-time
Dead nozzle compensation | Extends printhead life and yield; Reduces printhead cost
Color space agnostic | Compatible with all inksets and image sources including RGB, CMYK, spot, CIE L*a*b*, hexachrome, YCrCbK, sRGB and other
Color space conversion | Higher quality / lower bandwidth
Computer interface | USB1.1 interface to Host and ISI interface to ISI-Bridge chip thereby allowing connection to IEEE 1394, Bluetooth etc.
Cascadable in resolution | Printers of any resolution
Cascadable in color depth | Special color sets e.g. hexachrome can be used
Cascadable in image size | Printers of any width up to 16 inches
Cascadable in pages | Printers can print both sides simultaneously
Cascadable in speed | Higher speeds are possible by having each SoPEC print one vertical strip of the page.
Fixative channel data generation | Extremely fast ink drying without wastage
Built-in security | Revenue models are protected
Undercolor removal on dot-by-dot basis | Reduced ink usage
Does not require fonts for high speed operation | No font substitution or missing fonts
Flexible printhead configuration | Many configurations of printheads are supported by one chip type
Drives Bi-lithic printheads directly | No print driver chips required, results in lower cost
Determines dot accurate ink usage | Removes need for physical ink monitoring system in ink cartridges



9.1 Printing rates

The required printing rate for SoPEC is 30 sheets per minute with an inter-sheet spacing of 4 cm. To
achieve a 30 sheets per minute print rate, this requires:

2 sec / (300 mm x 63 dot/mm) = 105.8 μseconds per line, with no inter-sheet gap.
2 sec / (340 mm x 63 dot/mm) = 93.3 μseconds per line, with a 4 cm inter-sheet gap.

A printline for an A4 page consists of 13824 nozzles across the page [2]. At a system clock rate of 160
MHz, 13824 dots of data can be generated in 86.4 μseconds. Therefore data can be generated fast enough
to meet the printing speed requirement. It is necessary to deliver this print data to the printheads.

Printheads can be made up of 5:5, 6:4, 7:3 and 8:2 inch printhead combinations [2]. Print data is transferred
to both printheads in a pair simultaneously. This means the longest time to print a line is determined
by the time to transfer print data to the longest print segment. There are 9744 nozzles across a 7 inch
printhead. The print data is transferred to the printhead at a rate of 106 MHz (2/3 of the system clock rate) per
color plane. This means that it will take 91.9 μs to transfer a single line for a 7:3 printhead configuration.
So we can meet the requirement of 30 sheets per minute printing with a 4 cm gap with a 7:3 printhead
combination. There are 11160 nozzles across an 8 inch printhead. To transfer the data to the printhead at 106 MHz
will take 105.3 μs. So an 8:2 printhead combination printing with an inter-sheet gap will print slower than
30 sheets per minute.

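
The timing budget above can be rechecked with a few lines of arithmetic. The constants come from the text (63 dots/mm, 2 seconds per sheet, a 160 MHz system clock and a 106 MHz printhead transfer rate); the function and constant names are illustrative only.

```python
DOTS_PER_MM = 63
SYSTEM_CLOCK_HZ = 160e6
PRINTHEAD_CLOCK_HZ = 106e6  # roughly 2/3 of the system clock, as quoted in the text


def line_time_us(sheet_length_mm, seconds_per_sheet=2.0):
    """Time available per print line, in microseconds."""
    return seconds_per_sheet / (sheet_length_mm * DOTS_PER_MM) * 1e6


def transfer_time_us(dots, clock_hz=PRINTHEAD_CLOCK_HZ):
    """Time to clock `dots` dots of one color plane at `clock_hz`, in microseconds."""
    return dots / clock_hz * 1e6


budget = line_time_us(340)                    # 300 mm sheet plus 4 cm gap
seven_inch_ok = transfer_time_us(9744) <= budget
eight_inch_ok = transfer_time_us(11160) <= budget
```

With these figures the 7 inch segment fits within the per-line budget for a 4 cm gap and the 8 inch segment does not, which matches the conclusion reached in the text.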
9.2 SoPEC BASIC ARCHITECTURE 



From the highest point of view the SoPEC device consists of 3 distinct subsystems:

• CPU Subsystem 

• DRAM Subsystem 

• Print Engine Pipeline (PEP) Subsystem 

See Figure 13 for a block level diagram of SoPEC.



9.2.1 CPU Subsystem 



The CPU subsystem controls and configures all aspects of the other subsystems. It provides general support
for interfacing and synchronising the external printer with the internal print engine. It also controls the
low speed communication to the QA chips. The CPU subsystem contains various peripherals to aid the
CPU such as GPIO (includes motor control), interrupt controller, LSS Master and general timers. The
Serial Communications Block (SCB) on the CPU subsystem provides a full speed USB1.1 interface to the
Host as well as an Inter SoPEC Interface (ISI) to other SoPEC devices.

9.2.2 DRAM Subsystem

The DRAM subsystem accepts requests from the CPU, Serial Communications Block (SCB) and blocks
within the PEP subsystem. The DRAM subsystem (in particular the DIU) arbitrates the various requests
and determines which request should win access to the DRAM. The DIU arbitrates based on configured
parameters to allow sufficient access to DRAM for all requestors. The DIU also hides the implementation
specifics of the DRAM such as page size, number of banks, refresh rates etc.

9.2.3 Print Engine Pipeline (PEP) subsystem 

The Print Engine Pipeline (PEP) subsystem accepts compressed pages from DRAM and renders them to
bi-level dots for a given print line destined for a printhead interface that communicates directly with up to
2 segments of a bi-lithic printhead.

The first stage of the page expansion pipeline is the CDU, LBD and TE. The CDU expands the JPEG-compressed
contone (typically CMYK) layer, the LBD expands the compressed bi-level layer (typically K)
and the TE encodes Netpage tags for later rendering (typically in IR or K ink). The output from the first
stage is a set of buffers: the CFU, SFU, and TFU. The CFU and SFU buffers are implemented in DRAM.

The second stage is the HCU, which dithers the contone layer, and composites position tags and the bi-level
spot0 layer over the resulting bi-level dithered layer. A number of options exist for the way in which
compositing occurs. Up to 6 channels of bi-level data are produced from this stage. Note that not all 6
channels may be present on the printhead. For example, the printhead may be CMY only, with K pushed
into the CMY channels and IR ignored. Alternatively, the position tags may be printed in K if IR ink is not
available (or for testing purposes).

The third stage (DNC) compensates for dead nozzles in the printhead by color redundancy and error dif- 
fusing dead nozzle data into surrounding dots. 

The resultant bi-level 6 channel dot-data (typically CMYK-IRF) is buffered and written out to a set of line 
buffers stored in DRAM via the DWU. 

Finally, the dot-data is loaded back from DRAM, and passed to the printhead interface via a dot FIFO. The
dot FIFO accepts data from the LLU at the system clock rate (pclk), while the PHI removes data from the
FIFO and sends it to the printhead at a rate of 2/3 times the system clock rate (see Section 9.1).

9.3 SoPEC Block Description

Looking at Figure 13, the various units are described here in summary form:



Table 9. Units within SoPEC

Unit   Unit Name                     Description
-----  ----------------------------  ------------------------------------------------------

Subsystem: DRAM

DIU    DRAM interface unit           Provides the interface for DRAM read and write access
                                     for the various SoPEC units, CPU and the SCB block.
                                     The DIU provides arbitration between competing units
                                     and controls DRAM access.
DRAM   Embedded DRAM                 20 Mbits of embedded DRAM.

Subsystem: CPU

CPU    Central Processing Unit       CPU for system configuration and control.
MMU    Memory Management Unit        Limits access to certain memory address areas in CPU
                                     user mode.
RDU    Real-time Debug Unit          Facilitates the observation of the contents of most of
                                     the CPU addressable registers in SoPEC in addition to
                                     some pseudo-registers in realtime.
TIM    General Timer                 Contains watchdog and general system timers.
LSS    Low Speed Serial Interfaces   Low level controller for interfacing with the QA chips.
GPIO   General Purpose IOs           General IO controller, with built-in Motor control unit,
                                     LED pulse units and de-glitch circuitry.
ROM    Boot ROM                      16 KBytes of System Boot ROM code.
ICU    Interrupt Controller Unit     General Purpose interrupt controller with configurable
                                     priority, and masking.
CPR    Clock, Power and Reset block  Central Unit for controlling and generating the system
                                     clocks and resets and powerdown mechanisms.
PSS    Power Save Storage            Storage retained while system is powered down.
USB    Universal Serial Bus Device   USB device controller for interfacing with the Host USB.
ISI    Inter-SoPEC Interface         ISI controller for data and control communication with
                                     other SoPECs in a multi-SoPEC system.
SCB    Serial Communication Block    Contains both the USB and ISI blocks.

Subsystem: Print Engine Pipeline (PEP)

PCU    PEP controller                Provides external CPU with the means to read and write
                                     PEP Unit registers, and read and write DRAM in single
                                     32-bit chunks.
CDU    Contone decoder unit          Expands JPEG compressed contone layer and writes
                                     decompressed contone to DRAM.
CFU    Contone FIFO Unit             Provides line buffering between CDU and HCU.
LBD    Lossless Bi-level Decoder     Expands compressed bi-level layer.
SFU    Spot FIFO Unit                Provides line buffering between LBD and HCU.
TE     Tag encoder                   Encodes tag data into line of tag dots.
TFU    Tag FIFO Unit                 Provides tag data storage between TE and HCU.
HCU    Halftoner compositor unit     Dithers contone layer and composites the bi-level spot 0
                                     and position tag dots.
DNC    Dead Nozzle Compensator       Compensates for dead nozzles by color redundancy and
                                     error diffusing dead nozzle data into surrounding dots.
DWU    Dotline Writer Unit           Writes out the 6 channels of dot data for a given
                                     printline to the line store DRAM.
LLU    Line Loader Unit              Reads the expanded page image from line store,
                                     formatting the data appropriately for the bi-lithic
                                     printhead.
PHI    PrintHead Interface           Is responsible for sending dot data to the bi-lithic
                                     printheads and for providing line synchronization
                                     between multiple SoPECs. Also provides test interface to
                                     printhead such as temperature monitoring and Dead Nozzle
                                     Identification.



9.4 Addressing scheme in SoPEC 

SoPEC must address 

• 20 Mbit DRAM. 

• PCU addressed registers in PEP. 

• CPU-subsystem addressed registers. 

SoPEC has a unified address space with the CPU capable of addressing all CPU-subsystem and PCU-bus 
accessible registers (in PEP) and all locations in DRAM. The CPU generates byte-aligned addresses for 
the whole of SoPEC. 

22 bits are sufficient to byte address the whole SoPEC address space. 

9.4.1 DRAM addressing scheme 

The embedded DRAM is composed of 256-bit words. However the CPU-subsystem may need to write 
individual bytes of DRAM. Therefore it was decided to make the DIU byte addressable. 22 bits are 
required to byte address 20 Mbits of DRAM. 

Most blocks read or write 256-bit words of DRAM. Therefore only the top 17 bits i.e. bits 21 to 5 are 
required to address 256-bit word aligned locations. 

The exceptions are 

• CDU which can write 64-bits so only the top 19 address bits i.e. bits 21-3 are required.

• The CPU-subsystem always generates a 22-bit byte-aligned DIU address but it will send flags to the
DIU indicating whether it is an 8, 16 or 32-bit write.

All DIU accesses must be within the same 256-bit aligned DRAM word.
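
The address widths quoted in this section can be checked with a short calculation. The following Python sketch is illustrative only: it assumes that 20 Mbit means 20 x 2^20 bits, and the function names are hypothetical, not part of SoPEC.

```python
DRAM_BITS = 20 * 1024 * 1024     # assumption: 20 Mbit = 20 * 2**20 bits
DRAM_BYTES = DRAM_BITS // 8      # 2,621,440 bytes
WORD_BYTES = 256 // 8            # one 256-bit DRAM word is 32 bytes


def byte_address_bits(size_bytes: int) -> int:
    """Number of address bits needed to byte-address size_bytes locations."""
    return (size_bytes - 1).bit_length()


def word_index(byte_addr: int) -> int:
    """256-bit word number, i.e. address bits 21:5 (the low 5 bits select a byte)."""
    return byte_addr >> 5


def within_one_word(byte_addr: int, access_bytes: int) -> bool:
    """True if the access starts and ends inside the same 256-bit aligned word."""
    last_byte = byte_addr + access_bytes - 1
    return word_index(byte_addr) == word_index(last_byte)
```

With these assumptions `byte_address_bits(DRAM_BYTES)` gives 22, and dropping the 5 byte-select bits leaves the 17 word-address bits (21 to 5) mentioned above. A 32-bit write at byte address 0x1E would straddle two DRAM words, so `within_one_word(0x1E, 4)` is false.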



9.4.2 PEP Unit DRAM addressing

PEP Unit configuration registers which specify DRAM locations should specify 256-bit aligned DRAM
addresses i.e. using address bits 21:5. Legacy blocks from PEC1 e.g. the LBD and TE may need to specify
64-bit aligned DRAM addresses if these reused blocks' DRAM addressing is difficult to modify. These
64-bit aligned addresses require address bits 21:3. However, these 64-bit aligned addresses should be
programmed to start at a 256-bit DRAM word boundary.

Unlike PEC1, there are no constraints in SoPEC on data organization in DRAM except that all data
structures must start on a 256-bit DRAM boundary. If data stored is not a multiple of 256-bits then the last word
should be padded.

9.4.3 CPU-bus addressed registers

The CPU-bus supports 32-bit word aligned read and write accesses with variable access timings. See
Section 11.4 for more details of the access protocol used on this bus. The CPU-bus does not currently support
byte reads and writes but this can be added at a later date if required by imported IP.

9.4.4 PCU addressed registers in PEP

The PCU only supports 32-bit register reads and writes for the PEP blocks. As the PEP blocks only occupy
a subsection of the overall address map and the PCU is explicitly selected by the MMU when a PEP block
is being accessed the PCU does not need to perform a decode of the higher-order address bits. See
Table 11 for the PEP subsystem address map.



9.5 SoPEC Memory Map 



9.5.1 Main memory map 



The system wide memory map is shown in Figure 14 below. The memory map is discussed in detail in
Section 11 Central Processing Unit (CPU).



[Figure 14: memory map diagram. Region boundaries, from the top of the address space down:]

0xFFFF_FFFF
    Unused region. Accesses in this area are not allowed and result in a bus error exception.
0x002A_C000
    PCU Mapped Registers
0x002A_0000
    Peripheral Registers
0x0029_0000
    ROM
0x0028_0000
    DRAM (DRAM Regions)
0x0000_0000

Figure notes: Accesses in the register areas are via the CPU bus and are controlled by permissions set in
each peripheral. Accesses in the DRAM area are via the DIU bus and are controlled by permissions set in
the MMU.

Figure 14. Proposed SoPEC CPU memory map (not to scale)

9.5.2 CPU-bus peripherals address map 

The address mapping for the peripherals attached to the CPU-bus is shown in Table 10 below. The MMU
performs the decode of cpu_adr[21:12] to generate the relevant cpu_block_select signal for each block.
The addressed blocks decode however many of the lower order bits of cpu_adr[11:2] are required to
address all the registers within the block.

Table 10. CPU-bus peripherals address map

Block base   Address
-----------  --------------------------
MMU_base     0x0029_0000
TIM_base     0x0029_1000
LSS_base     0x0029_2000
GPIO_base    0x0029_3000
SCB_base     0x0029_4000
ICU_base     0x0029_5000
CPR_base     0x0029_6000
ROM_base     0x0029_7000
DIU_base     0x0029_8000
PSS_base     0x0029_9000
Reserved     0x0029_A000 to 0x0029_FFFF
PCU_base     0x002A_0000 to 0x002A_BFFF

9.5.3 PCU Mapped Registers (PEP blocks) address map 

The PEP blocks are addressed via the PCU. From Figure 14, the PCU mapped registers are in the range
0x002A_0000 to 0x002A_BFFF. From Table 11 it can be seen that there are 12 sub-blocks within the PCU
address space. Therefore, only four bits are necessary to address each of the sub-blocks within the PEP
part of SoPEC. A further 12 bits may be used to address any configurable register within a PEP block. This
gives scope for 1024 configurable registers per sub-block (the PCU mapped registers are all 32-bit
addressed registers so the upper 10 bits are required to individually address them). This address will come
either from the CPU or from a command stored in DRAM. The bus is assembled as follows:

• address[15:12] = sub-block address,

• address[11:2] = register address within sub-block, only the number of bits required to decode the
registers within each sub-block are used,

• address[1:0] = byte address, unused as PCU mapped registers are all 32-bit addressed registers.

So for the case of the HCU, its addresses range from 0x7000 to 0x7FFF within the PEP subsystem or from
0x002A_7000 to 0x002A_7FFF in the overall system.
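
The bus assembly above can be illustrated with a small Python sketch that splits a system address into its three fields. The function name is hypothetical; the sub-block order follows the PEP blocks address map (Table 11).

```python
PCU_BASE = 0x002A_0000
# Sub-block order taken from the PEP blocks address map (Table 11).
PEP_SUB_BLOCKS = ["PCU", "CDU", "CFU", "LBD", "SFU", "TE",
                  "TFU", "HCU", "DNC", "DWU", "LLU", "PHI"]


def decode_pep_address(address: int):
    """Split a system address into (sub-block name, register index, byte offset)."""
    offset = address - PCU_BASE
    if not 0 <= offset <= 0xBFFF:
        raise ValueError("address is outside the PCU mapped register range")
    sub_block = (offset >> 12) & 0xF     # address[15:12]
    register = (offset >> 2) & 0x3FF     # address[11:2], up to 1024 registers
    byte = offset & 0x3                  # address[1:0], unused by the PCU
    return PEP_SUB_BLOCKS[sub_block], register, byte
```

Decoding 0x002A_7000 yields the HCU's register 0, and 0x002A_7FFC yields its last possible register, index 1023.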



Table 11. PEP blocks address map

Block base   Address
-----------  --------------------------
PCU_base     0x002A_0000
CDU_base     0x002A_1000
CFU_base     0x002A_2000
LBD_base     0x002A_3000
SFU_base     0x002A_4000
TE_base      0x002A_5000
TFU_base     0x002A_6000
HCU_base     0x002A_7000
DNC_base     0x002A_8000
DWU_base     0x002A_9000
LLU_base     0x002A_A000
PHI_base     0x002A_B000 to 0x002A_BFFF

9.6 Buffer management in SoPEC

As outlined in Section 9.1, SoPEC has a requirement to print 1 side every 2 seconds i.e. 30 sides per
minute.

9.6.1 Page buffering 

Approximately 2 Mbytes of DRAM are reserved for compressed page buffering in SoPEC. If a page is
compressed to fit within 2 Mbyte then a complete page can be transferred to DRAM before printing.
However, the time to transfer 2 Mbyte using USB 1.1 is approximately 2 seconds. The worst case cycle time to
print a page then approaches 4 seconds. This reduces the worst-case print speed to 15 pages per minute.
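
The worst-case figure follows from adding the transfer time to the print time. A minimal Python sketch of that arithmetic is given here; the function name and the `overlapped` flag are illustrative assumptions, and the 2 second values are the ones quoted in the text.

```python
def pages_per_minute(print_seconds: float, transfer_seconds: float,
                     overlapped: bool) -> float:
    """Sustained page rate for a given print time and download time.

    If the download cannot overlap printing, each page costs the sum of
    both times; if it fully overlaps, the slower of the two dominates.
    """
    if overlapped:
        cycle = max(print_seconds, transfer_seconds)
    else:
        cycle = print_seconds + transfer_seconds
    return 60.0 / cycle
```

With 2 seconds to print and 2 seconds to transfer, the non-overlapped case gives 15 pages per minute, while a fully overlapped download would sustain 30.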



9.6.2 Band buffering 

The SoPEC page-expansion blocks support the notion of page banding. The page can be divided into
bands and another band can be sent down to SoPEC while we are printing the current band.
Therefore we can start printing once at least one band has been downloaded.

The band size granularity should be carefully chosen to allow efficient use of the USB bandwidth and
DRAM buffer space. It should be small enough to allow seamless 30 sides per minute printing but not so
small as to introduce excessive CPU overhead in orchestrating the data transfer and parsing the band
headers. Band-finish interrupts have been provided to notify the CPU of free buffer space. It is likely that the
Host PC will supervise the band transfer and buffer management instead of the SoPEC CPU.

If SoPEC starts printing before the complete page has been transferred to memory there is a risk of a buffer
underrun occurring if subsequent bands are not transferred to SoPEC in time e.g. due to insufficient USB
bandwidth caused by another USB peripheral consuming USB bandwidth. A buffer underrun occurs if a
line synchronisation pulse is received before a line of data has been transferred to the printhead and causes
the print job to fail at that line. If there is no risk of buffer underrun then printing can safely start once at
least one band has been downloaded.

If there is a risk of a buffer underrun occurring due to an interruption of compressed page data transfer,
then the safest approach is to only start printing once we have loaded up the data for a complete page. This
means that a worst case latency in the region of 2 seconds (with USB 1.1) will be incurred before printing
the first page. Subsequent pages will take 2 seconds to print giving us the required sustained printing rate
of 30 sides per minute.
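
The trade-off between starting early and waiting for the full page can be illustrated with a simple timing model. The Python sketch below is a simplification under stated assumptions: bands are treated as the unit of underrun (the text defines underrun per line), every band takes the same time to print, and the function name is hypothetical.

```python
from typing import List, Optional


def first_underrun(band_arrival_times: List[float], band_print_time: float,
                   start_time: float) -> Optional[int]:
    """Return the index of the first band that has not arrived when the
    printer needs it, or None if every band arrives in time."""
    needed_at = start_time
    for index, arrived in enumerate(band_arrival_times):
        if arrived > needed_at:
            return index              # data missing when printing reaches it
        needed_at += band_print_time  # next band is needed one band-time later
    return None
```

For a page of four bands printed in 0.5 seconds each, arrivals at 0.5, 1.0, 1.5 and 2.0 seconds allow printing to start at 0.5 seconds without underrun. If the third band is delayed to 1.8 seconds, starting at 0.5 seconds underruns at band index 2, whereas starting once the whole page has arrived (2.0 seconds) does not.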

A Storage SoPEC (Section 7.2.5) could be added to the system to provide guaranteed bandwidth data 
delivery. The print system could also be constructed using an ISI-Bridge chip (Section 7.2.6) to provide 
guaranteed data delivery. 

The most efficient page banding strategy is likely to be determined on a per page/print job basis and so
SoPEC will support the use of bands of any size.

10 SoPEC Use Cases

10.1 Introduction 

This chapter is intended to give an overview of a representative set of scenarios or use cases which SoPEC
can perform. SoPEC is by no means restricted to the particular use cases described here.

In this chapter we discuss SoPEC use cases under four headings: 

1) Normal operation use cases. 

2) Security use cases. 

3) Miscellaneous use cases. 

4) Failure mode use cases. 

Use cases for both single and multi-SoPEC systems are outlined.
Some tasks may be composed of a number of sub-tasks. 

The realtime requirements for SoPEC software tasks are discussed in "Central Processing Unit (CPU)"
under Section 11.3 Realtime requirements.

10.2 Normal operation in a single SoPEC System with USB Host connection 

SoPEC operation is broken up into a number of sections which are outlined below. Buffer management in
a SoPEC system is normally performed by the Host.

10.2.1 Powerup

Powerup describes SoPEC initialisation following an external reset or the watchdog timer system reset.
A typical powerup sequence is:

1) Execute reset sequence for complete SoPEC.

2) CPU boot from ROM. 

3) Basic configuration of CPU peripherals, SCB and DIU. DRAM initialisation. USB Wakeup. 

4) Download and authentication of program (see Section 10.5.2). 

5) Store reusable authentication results in Power-Safe Storage (PSS). 

6) Execution of program from DRAM. 

7) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

8) Download and authenticate any further datasets. 

10.2.2 USB wakeup 

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block
(chapter 16). Normally the CPU sub-system and the DRAM will be put in sleep mode but the SCB and
power-safe storage (PSS) will still be enabled.

Wakeup describes SoPEC recovery from sleep mode with the SCB and power-safe storage (PSS) still 
enabled. In a single SoPEC system, wakeup can be initiated following a USB reset from the SCB. 
A typical USB wakeup sequence is: 

1) Execute reset sequence for sections of SoPEC in sleep mode.

2) CPU boot from ROM, if CPU-subsystem was in sleep mode. 

3) Basic configuration of CPU peripherals and DIU, and DRAM initialisation, if required.

4) Download and authentication of program using results in Power-Safe Storage (PSS) (see Section
10.5.2).

5) Execution of program from DRAM. 

6) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters. 

7) Download and authenticate, using results in PSS, any further datasets (programs).

10.2.3 Print initialization 

This sequence is typically performed at the start of a print job following powerup or wakeup: 

1) Check amount of ink remaining via QA chips.

2) Download static data e.g. dither matrices, dead nozzle tables from Host to DRAM.

3) Check printhead temperature, if required, and configure printhead with firing pulse profile etc.
accordingly.

4) Initiate printhead pre-heat sequence, if required.

10.2.4 First page download 

Buffer management in a SoPEC system is normally performed by the Host. 
First page, first band download and processing: 

1) The Host communicates to the SoPEC CPU over the USB to check that DRAM space remaining is 
sufficient to download the first band. 

2) The Host downloads the first band (with the page header) to DRAM. 

3) When the complete page header has been downloaded the SoPEC CPU processes the page header,
calculates PEP register commands and writes directly to PEP registers or to DRAM. 

4) If PEP register commands have been written to DRAM, execute PEP commands from DRAM via 
PCU. 

Remaining bands download and processing: 

1) Check DRAM space remaining is sufficient to download the next band.

2) Download the next band with the band header to DRAM.

3) When the complete band header has been downloaded, process the band header according to 
whichever band-related register updating mechanism is being used. 
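
The check-then-download loop for the remaining bands can be sketched from the Host's point of view. The Python model below is purely illustrative: `BandStore` and `download_bands` are hypothetical names, band sizes are arbitrary, and no real USB or SoPEC interface is implied.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BandStore:
    """Hypothetical model of the DRAM area reserved for compressed bands."""
    capacity: int                                   # bytes available for bands
    bands: List[bytes] = field(default_factory=list)

    def free_space(self) -> int:
        return self.capacity - sum(len(b) for b in self.bands)

    def download(self, band: bytes) -> None:
        if len(band) > self.free_space():
            raise MemoryError("band does not fit in remaining DRAM space")
        self.bands.append(band)

    def band_finished(self) -> None:
        """A band has been printed, so its memory becomes free again."""
        self.bands.pop(0)


def download_bands(store: BandStore, pending: List[bytes]) -> int:
    """Download bands in order while space allows; return how many were sent."""
    sent = 0
    for band in pending:
        if len(band) > store.free_space():   # step 1: check remaining space
            break                            # wait for a band-finish event
        store.download(band)                 # step 2: download band and header
        sent += 1
    return sent
```

In this model the Host stops sending when the next band no longer fits and resumes after a band-finish notification has freed space.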

10.2.5 Start printing

1) Wait until at least one band of the first page has been downloaded.

One approach is to only start printing once we have loaded up the data for a complete page. If we
start printing before the complete page has been transferred to memory we run the risk of a buffer
underrun occurring because compressed page data was not transferred to SoPEC in time e.g. due to
insufficient USB bandwidth caused by another USB peripheral consuming USB bandwidth.

2) Start all the PEP Units by writing to their Go registers, via PCU commands executed from DRAM
or direct CPU writes. A rapid startup order for the PEP units is outlined in Table 12.



Table 12. Typical PEP Unit startup order for printing a page

Step  Unit
----  -------------
1     DNC
2     DWU
3     HCU
4     PHI
5     LLU
6     CFU, SFU, TFU
7     CDU
8     TE, LBD

3) Print ready interrupt occurs (from PHI).

4) Start motor control, if first page, otherwise feed the next page. This step could occur before the print
ready interrupt.

5) Drive LEDs, monitor paper status.

6) Wait for page alignment via page sensor(s) GPIO interrupt.

7) CPU instructs PHI to start producing line syncs and hence commence printing, or wait for an
external device to produce line syncs.

8) Continue to download bands and process page and band headers for next page.
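
Starting the PEP Units in the Table 12 order amounts to a sequence of register writes. The Python sketch below models this with the base addresses from Table 11. The Go register offset (`GO_OFFSET`), the written value 1 and the `write_reg` callback are assumptions made for illustration; the actual register layout is not given in this section.

```python
# Base addresses from Table 11; startup order from Table 12.
PEP_BASE = {
    "CDU": 0x002A_1000, "CFU": 0x002A_2000, "LBD": 0x002A_3000,
    "SFU": 0x002A_4000, "TE": 0x002A_5000, "TFU": 0x002A_6000,
    "HCU": 0x002A_7000, "DNC": 0x002A_8000, "DWU": 0x002A_9000,
    "LLU": 0x002A_A000, "PHI": 0x002A_B000,
}
STARTUP_ORDER = ["DNC", "DWU", "HCU", "PHI", "LLU",
                 "CFU", "SFU", "TFU", "CDU", "TE", "LBD"]
GO_OFFSET = 0x0   # hypothetical: the real Go register offset is not stated here


def start_pep_units(write_reg) -> None:
    """Set each unit's Go register in startup order.

    write_reg(address, value) is supplied by the caller, e.g. a direct CPU
    write or a routine that queues a PCU command in DRAM.
    """
    for unit in STARTUP_ORDER:
        write_reg(PEP_BASE[unit] + GO_OFFSET, 1)
```

Passing a function that records its arguments makes it easy to inspect the resulting write sequence before committing it to hardware.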



10.2.6 Next page(s) download

As for first page download, performed during printing of current page. 

10.2.7 Between bands 

When the finished band flags are asserted, band related registers in the CDU, LBD and TE need to be
re-programmed before the subsequent band can be printed. This can be via PCU commands from DRAM.
Typically only 3-5 commands per decompression unit need to be executed. These registers can also be
reprogrammed directly by the CPU or, most likely, by updating from shadow registers. The finished band
flag interrupts the CPU to tell the CPU that the area of memory associated with the band is now free.

10.2.8 During page print 

Typically during page printing ink usage is communicated to the QA chips. 

1) Calculate ink printed (from PHI). 

2) Decrement ink remaining (via QA chips). 

3) Check amount of ink remaining (via QA chips). This operation may be better performed while the 
page is being printed rather than at the end of the page. 

10.2.9 Page finish 

These operations are typically performed when the page is finished: 

1) Page finished interrupt occurs from PHI.

2) Shutdown the PEP blocks by de-asserting their Go registers. A typical shutdown order is defined in
Table 13. This will set the PEP Unit state-machines to their idle states without resetting their
configuration registers.

3) Communicate ink usage to QA chips, if required.

Table 13. End of page shutdown order for PEP Units (TBD)

Step  Unit                Notes
----  ------------------  ----------------------------------------------------------------
1     PHI                 Will shutdown by itself in the normal case at the end of a page
2     DWU                 Shutting this down stalls the DNC and therefore the HCU and above
3     LLU                 Should already be halted due to PHI at end of last line of page
4     TE                  This is the only dot supplier likely to be running, halted by the HCU
5     CDU                 This is likely to already be halted due to end of contone band
6     CFU, SFU, TFU, LBD  Order unimportant, and should already be halted due to end of band
7     HCU, DNC            Order unimportant, should already have halted



10.2.10 Start of next page

These operations are typically performed before printing the next page:

1) Re-program the PEP Units via PCU command processing from DRAM based on page header. 

2) Go to Start printing. 

10.2.11 End of document

1) Stop motor control. 

10.2.12 Powerdown 

In this mode SoPEC is no longer powered. 

1) Instruct Host PC via USB that SoPEC is about to power down. 



10.2.13 Sleep 



The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block 
described in Section 16. 

1) Instruct Host PC via USB that SoPEC is about to sleep. 

2) Put SoPEC into defined sleep mode.

10.3 Normal operation in a Multi-SoPEC System - ISIMaster SoPEC

In a multi-SoPEC system the Host generally manages program and compressed page download to all the
SoPECs. Inter-SoPEC communication is over the ISI link, which will add a latency.

In the case of a multi-SoPEC system with a USB 1.1 connection, the SoPEC with the USB connection is
the ISIMaster. The ISI-bridge chip is the ISIMaster in the case of an ISI-Bridge SoPEC configuration.

In a multi-SoPEC system one of the SoPECs will be the PrintMaster. This SoPEC must manage and con- 
trol sensors and actuators e.g. motor control. These sensors and actuators could be distributed over all the 
SoPECs in the system. An ISIMaster SoPEC may also be the PrintMaster SoPEC. 

In a multi-SoPEC system each printing SoPEC will generally have its own PRINTER_QA chip (or at least
access to a PRINTER_QA chip that contains the SoPEC's SOPEC_id_key) to validate operating
parameters and ink usage. The results of these operations may be communicated to the PrintMaster SoPEC.
In general the ISIMaster may need to be able to:

• Send messages to the ISISlaves which will cause the ISISlaves to send their status to the ISIMaster.

• Instruct the ISISlaves to perform certain operations.

As the ISI is an insecure interface, commands issued over the ISI are regarded as user mode commands.
Supervisor mode code running on the SoPEC CPUs will allow or disallow these commands. The software
protocol needs to be constructed with this in mind.

Existing requirements indicate that it is sufficient for the ISIMaster to initiate all communication with the
ISISlaves.

SoPEC operation is broken up into a number of sections which are outlined below. 
10.3.1 Powerup 

Powerup describes SoPEC initialisation following an external reset or the watchdog timer system reset.

1) Execute reset sequence for complete SoPEC. 

2) CPU boot from ROM. 

3) Basic configuration of CPU peripherals, SCB and DIU. DRAM initialisation. USB Wakeup.

4) SoPEC identification by activity on USB end-points 2-4 indicates it is the ISIMaster.

5) Download and authentication of program (see Section 10.5.3).

6) Store reusable cryptographic results in Power-Safe Storage (PSS).

7) Execution of program from DRAM.

8) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

9) Download and authenticate any further datasets (programs).

10) The initial dataset may be broadcast to all the ISISlaves.

11) ISIMaster SoPEC then waits for a short time to allow the authentication to take place on the
ISISlave SoPECs.

12) Each ISISlave SoPEC is polled for the result of its program code authentication process.

13) If all ISISlaves report successful authentication the OEM code module can be distributed and
authenticated. OEM code will most likely reside on one SoPEC.
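
The polling in steps 11 to 13 can be sketched as a simple loop run by the ISIMaster. The Python below is a hypothetical model: the `poll_slave` callback, its three-valued result and the `max_polls` limit are assumptions made for illustration and do not describe the actual ISI protocol.

```python
from typing import Callable, Iterable, Optional


def all_slaves_authenticated(poll_slave: Callable[[int], Optional[bool]],
                             slave_ids: Iterable[int],
                             max_polls: int = 10) -> bool:
    """Poll each ISISlave until all report success, one fails, or polls run out.

    poll_slave(isi_id) is assumed to return True (authenticated),
    False (authentication failed) or None (result not ready yet).
    """
    pending = set(slave_ids)
    for _ in range(max_polls):
        for isi_id in sorted(pending):
            status = poll_slave(isi_id)
            if status is False:
                return False               # one failure blocks OEM code distribution
            if status is True:
                pending.discard(isi_id)
        if not pending:
            return True                    # every ISISlave reported success
    return False                           # some ISISlave never answered in time
```

Only when this returns true would the OEM code module be distributed, matching step 13.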

10.3.2 USB wakeup

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block 
[16]. Normally the CPU sub-system and the DRAM will be put in sleep mode but the SCB and power-safe 
storage (PSS) will still be enabled.

Wakeup describes SoPEC recovery from sleep mode with the SCB and power-safe storage (PSS) still
enabled. For an ISIMaster SoPEC, wakeup can be initiated following a USB reset from the SCB.

A typical USB wakeup sequence is: 

1 ) Execute reset sequence for sections of SoPEC in sleep mode. 

2) CPU boot from ROM, if CPU-subsystem was in sleep mode. 

3) Basic configuration of CPU peripherals and DIU, and DRAM initialisation, if required.

4) SoPEC identification by activity on USB end-points 2-4 indicates it is the ISIMaster.

5) Download and authentication of program using results in Power-Safe Storage (PSS) (see Section
10.5.3).

6) Execution of program from DRAM.

7) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

8) Download and authenticate any further datasets (programs) using results in Power-Safe Storage
(PSS) (see Section 10.5.3).

9) Following steps as per Powerup. 

10.3.3 Print initialization 

This sequence is typically performed at the start of a print job following powerup or wakeup:

1) Check amount of ink remaining via QA chips which may be present on an ISISlave SoPEC.

2) Download static data e.g. dither matrices, dead nozzle tables from Host to DRAM. 

3) Check printhead temperature, if required, and configure printhead with firing pulse profile etc. 
accordingly. Instruct ISISlaves to also perform this operation. 

4) Initiate printhead pre-heat sequence, if required. Instruct ISISlaves to also perform this operation.

10.3.4 First page download 

Buffer management in a SoPEC system is normally performed by the Host.

1) The Host communicates to the SoPEC CPU over the USB to check that DRAM space remaining is
sufficient to download the first band.

2) The Host downloads the first band (with the page header) to DRAM.

3) When the complete page header has been downloaded the SoPEC CPU processes the page header,
calculates PEP register commands and writes directly to PEP registers or to DRAM.

4) If PEP register commands have been written to DRAM, execute PEP commands from DRAM via 
PCU. 

Poll ISISlaves for DRAM status and download compressed data to ISISlaves. 

Remaining first page bands download and processing: 

1) Check DRAM space remaining is sufficient to download the next band.

2) Download the next band with the band header to DRAM. 

3) When the complete band header has been downloaded, process the band header according to 
whichever band-related register updating mechanism is being used. 

Poll ISISlaves for DRAM status and download compressed data to ISISlaves. 

10.3.5 Start printing 

1) Wait until at least one band of the first page has been downloaded. 

2) Start all the PEP Units by writing to their Go registers, via PCU commands executed from DRAM 
or direct CPU writes, in the suggested order defined in Table 12. 

3) Print ready interrupt occurs (from PHI). Poll ISISlaves until print ready interrupt.

4) Start motor control (which may be on an ISISlave SoPEC), if first page, otherwise feed the next
page. This step could occur before the print ready interrupt.

5) Drive LEDs, monitor paper status (which may be on an ISISlave SoPEC).

6) Wait for page alignment via page sensor(s) GPIO interrupt (which may be on an ISISlave SoPEC).

7) CPU instructs PHI to start producing master line syncs, or wait for an external device to produce 
line syncs. 

8) Continue to download bands and process page and band headers for next page. 



10.3.6 Next page(s) download

As for first page download, performed during printing of current page.

10.3.7 Between bands

When the finished band flags are asserted, band related registers in the CDU, LBD and TE need to be
re-programmed. This can be via PCU commands from DRAM. Typically only 3-5 commands per
decompression unit need to be executed. These registers can also be reprogrammed directly by the CPU or by
updating from shadow registers. The finished band flag interrupts the CPU to tell the CPU that the area of
memory associated with the band is now free.

10.3.8 During page print

Typically during page printing ink usage is communicated to the QA chips.

1) Calculate ink printed (from PHI).

2) Decrement ink remaining (via QA chips).

3) Check amount of ink remaining (via QA chips). This operation may be better performed while the
page is being printed rather than at the end of the page.

10.3.9 Page finish

These operations are typically performed when the page is finished:

1) Page finished interrupt occurs from PHI. Poll ISISlaves for page finished interrupts.

2) Shutdown the PEP blocks by de-asserting their Go registers in the suggested order in Table 13. This
will set the PEP Unit state-machines to their startup states.

3) Communicate ink usage to QA chips, if required.

10.3.10 Start of next page

These operations are typically performed before printing the next page:

1) Re-program the PEP Units via PCU command processing from DRAM based on page header.

2) Go to Start printing.

10.3.11 End of document

1) Stop motor control. This may be on an ISISlave SoPEC.



10.3.12 Powerdown 



In this mode SoPEC is no longer powered.

1) Instruct Host PC via USB that SoPEC system is about to power down.

2) Instruct ISISlave SoPECs to powerdown.

3) Powerdown ISIMaster SoPEC.

10.3.13 Sleep

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block
[16].

1) Instruct Host PC via USB which parts of SoPEC system are about to sleep. 

2) Put defined SoPECs into defined sleep modes.

10.4 Normal operation in a Multi-SoPEC System - ISISlave SoPEC

This section outlines typical operation of an ISISlave SoPEC in a multi-SoPEC system. The ISIMaster
can be another SoPEC or an ISI-Bridge chip. The ISISlave communicates with the Host via the ISIMaster.
Buffer management in a SoPEC system is normally performed by the Host.

10.4.1 Powerup 

Powerup describes SoPEC initialisation following an external reset or the watchdog timer system reset. 

A typical powerup sequence is:

1) Execute reset sequence for complete SoPEC. 

2) CPU boot from ROM. 

3) Basic configuration of CPU peripherals, SCB and DIU. DRAM initialisation. 

4) Download and authentication of program (see Section 10.5.3).

5) Store reusable cryptographic results in Power-Safe Storage (PSS). 

6) Execution of program from DRAM. 

7) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters. 

8) SoPEC identification by sampling GPIO pins to determine ISIId. Communicate ISIId to ISIMaster.

9) Download and authenticate any further datasets.

10.4.2 ISI wakeup 

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block 
[16]. Normally the CPU sub-system and the DRAM will be put in sleep mode but the SCB and power-safe 
storage (PSS) will still be enabled. 

Wakeup describes SoPEC recovery from sleep mode with the SCB and power-safe storage (PSS) still
enabled. In an ISISlave SoPEC, wakeup can be initiated following an ISI reset from the SCB.

A typical ISI wakeup sequence is: 

1) Execute reset sequence for sections of SoPEC in sleep mode. 

2) CPU boot from ROM, if CPU-subsystem was in sleep mode.

3) Basic configuration of CPU peripherals and DIU, and DRAM initialisation, if required.

4) Download and authentication of program using results in Power-Safe Storage (PSS) (see Section 
10.5.3). 

5) Execution of program from DRAM. 

6) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

7) SoPEC identification by sampling GPIO pins to determine ISIId. Communicate ISIId to ISIMaster.

8) Download and authenticate any further datasets. 
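
The identification step above (sampling GPIO pins to determine the ISIId) can be sketched as follows. This is only an illustration: the number of pins, their bit order and the function name `isi_id_from_pins` are assumptions, not taken from this specification.

```python
def isi_id_from_pins(pin_levels):
    """Combine sampled GPIO pin levels into an integer ISIId.

    pin_levels is a sequence of 0/1 values, least significant bit first.
    The pin count and bit order are hypothetical.
    """
    isi_id = 0
    for bit, level in enumerate(pin_levels):
        if level not in (0, 1):
            raise ValueError("pin levels must be 0 or 1")
        isi_id |= level << bit
    return isi_id
```

The resulting value is what the ISISlave would then communicate to the ISIMaster.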

10.4.3 Print initialization

This sequence is typically performed at the start of a print job following powerup or wakeup:

1) Check amount of ink remaining via QA chips. 

2) Download static data e.g. dither matrices, dead nozzle tables from ISIMaster to DRAM. 

3) Check printhead temperature, if required, and configure printhead with firing pulse profile etc. 
accordingly. 

4) Initiate printhead pre-heat sequence, if required. 
10.4.4 First page download

Buffer management in a SoPEC system is normally performed by the Host via the ISIMaster.

1) Check DRAM space remaining is sufficient to download the first band.

2) The Host downloads the first band (with the page header) to DRAM via the ISIMaster.

3) When the complete page header has been downloaded, process the page header, calculate PEP
register commands and write directly to PEP registers or to DRAM.

4) If PEP register commands have been written to DRAM, execute PEP commands from DRAM via
PCU.

Remaining first page bands download and processing: 

1) Check DRAM space remaining is sufficient to download the next band.

2) The Host downloads the next band (with its band header) to DRAM via the ISIMaster.

3) When the complete band header has been downloaded, process the band header according to 
whichever band-related register updating mechanism is being used. 
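
The space check that precedes each band download can be sketched as below. The class name `BandStore`, its methods and the byte-based accounting are assumptions made for illustration; the specification does not define how the Host or CPU tracks free DRAM.

```python
class BandStore:
    """Tracks how much of the DRAM band store is occupied (hypothetical model)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0

    def can_accept(self, band_bytes):
        # Step 1: is the remaining DRAM space sufficient for the band?
        return self.used + band_bytes <= self.capacity

    def download(self, band_bytes):
        # Step 2: only download when the check succeeds.
        if not self.can_accept(band_bytes):
            return False
        self.used += band_bytes
        return True

    def band_finished(self, band_bytes):
        # A finished band releases its memory for later downloads.
        self.used = max(0, self.used - band_bytes)
```

Releasing memory in `band_finished` mirrors the finished band flag interrupt, which tells the CPU that the area of memory associated with the band is free.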

10.4.5 Start printing

1) Wait until at least one band of the first page has been downloaded.

2) Start all the PEP Units by writing to their Go registers, via PCU commands executed from DRAM
or direct CPU writes, in the order defined in Table 12.

3) Print ready interrupt occurs (from PHI). Communicate to ISIMaster via ISI link.

4) Start motor control, if attached to this ISISlave, when requested by ISIMaster, if first page,
otherwise feed next page. This step could occur before the print ready interrupt.

5) Drive LEDs, monitor paper status, if on this ISISlave SoPEC, when requested by ISIMaster.

6) Wait for page alignment via page sensor(s) GPIO interrupt, if on this ISISlave SoPEC, and send to
ISIMaster.

7) Wait for line sync and commence printing.

8) Continue to download bands and process page and band headers for next page. 

10.4.6 Next page(s) download 

As for first page download, performed during printing of current page.

10.4.7 Between bands 

When the finished band flags are asserted, band related registers in the CDU, LBD and TE need to be
re-programmed. This can be via PCU commands from DRAM. Typically only 3-5 commands per
decompression unit need to be executed. These registers can also be reprogrammed directly by the CPU or by
updating from shadow registers. The finished band flag interrupts to the CPU tell the CPU that the area of
memory associated with the band is now free.

10.4.8 During page print 

Typically during page printing ink usage is communicated to the QA chips. 

1) Calculate ink printed (from PHI). 

2) Decrement ink remaining (via QA chips). 

3) Check amount of ink remaining (via QA chips). This operation may be better performed while the
page is being printed rather than at the end of the page. 
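
The three accounting steps above can be sketched as follows. The function name `account_for_ink`, the per-plane dictionaries and the use of dot counts as the unit of ink are assumptions for illustration; in the real system the remaining-ink values are held by the QA chips.

```python
def account_for_ink(remaining, dots_fired, minimum=0):
    """Decrement ink remaining per plane and report whether printing may continue.

    remaining and dots_fired map an ink plane name to a dot count.
    Returns (updated_remaining, can_print).
    """
    updated = {}
    for plane, left in remaining.items():
        # Steps 1 and 2: subtract the ink printed, never going below zero.
        updated[plane] = max(0, left - dots_fired.get(plane, 0))
    # Step 3: every plane must still hold more than the minimum.
    can_print = all(value > minimum for value in updated.values())
    return updated, can_print
```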
10.4.9 Page finish

These operations are typically performed when the page is finished:

1) Page finished interrupt occurs from PHI. Communicate page finished interrupt to ISIMaster.

2) Shutdown the PEP blocks by de-asserting their Go registers in the suggested order in Table 13. This
will set the PEP Unit state-machines to their startup states.

3) Communicate ink usage to QA chips, if required.

10.4.10 Start of next page 



These operations are typically performed before printing the next page: 

1) Re-program the PEP Units via PCU command processing from DRAM based on page header. 

2) Go to Start printing. 



10.4.11 End of document

Stop motor control, if attached to this ISISlave, when requested by ISIMaster.

10.4.12 Powerdown 

In this mode SoPEC is no longer powered. 

1) Powerdown ISISlave SoPEC when instructed by ISIMaster. 

10.4.13 Sleep 

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block
[16].



1) Put SoPEC into defined sleep modes. 
10.5 Security Use Cases

Please see the 'SoPEC Security Overview' [9] document for a more complete description of SoPEC
security issues. The SoPEC boot operation is described in the ROM chapter of the SoPEC hardware design
specification, Section 17.2.

10.5.1 Communication with the QA chips

Communication between SoPEC and the QA chips (i.e. INK_QA and PRINTER_QA) will take place on
at least a per power cycle and per page basis. Communication with the QA chips has three principal
purposes: validating the presence of genuine QA chips (i.e. the printer is using approved consumables),
validation of the amount of ink remaining in the cartridge and authenticating the operating parameters for the
printer. After each page has been printed, SoPEC is expected to communicate the number of dots fired per
ink plane to the QA chipset. SoPEC may also initiate decoy communications with the QA chips from time
to time.

Process: 

• When validating ink consumption SoPEC is expected to principally act as a conduit between the
PRINTER_QA and INK_QA chips and to take certain actions (basically enable or disable printing and
report status to Host PC) based on the result. The communication channels are insecure but all traffic is
signed to guarantee authenticity.

Known Weaknesses 

• All communication to the QA chips is over the LSS interfaces using a serial communication protocol.
This is open to observation and so the communication protocol could be reverse engineered. In this
case both the PRINTER_QA and INK_QA chips could be replaced by impostor devices (e.g. a single
FPGA) that successfully emulated the communication protocol. As this would require physical
modification of each printer this is considered to be an acceptably low risk. Any messages that are not signed
by one of the symmetric keys (such as the SoPEC_id_key) could be reverse engineered. The impostor
device must also have access to the appropriate keys to crack the system.

• If the secret keys in the QA chips are exposed or cracked then the system, or parts of it, is compro- 
mised. 

Assumptions: 

[1] The QA chips are not involved in the authentication of downloaded SoPEC code.

[2] The QA chip in the ink cartridge (INK_QA) does not directly affect the operation of the cartridge in
any way i.e. it does not inhibit the flow of ink etc.

[3] The INK_QA and PRINTER_QA chips are identical in their virgin state. They only become an
INK_QA or PRINTER_QA after their HashROM has been programmed.



10.5.2 Authentication of downloaded code in a single SoPEC system 
Process: 

1) SoPEC identification by activity on USB end-points 2-4 indicates it is the ISIMaster. 

2) The program is downloaded to the embedded DRAM. 

3) The CPU calculates a SHA-1 hash digest of the downloaded program. 

4) The ResetSrc register in the CPR block is read to determine whether or not a power-on reset
occurred.

5) If a power-on reset occurred the signature of the downloaded code (which needs to be in a known
location such as the first or last N bytes of the downloaded code) is decrypted using the Silverbrook
public boot0key stored in ROM. This decrypted signature is the expected SHA-1 hash of the
accompanying program. The encryption algorithm is likely to be a public key algorithm such as
RSA. If a power-on reset did not occur then the expected SHA-1 hash is retrieved from the PSS and
the compute intensive decryption is not required.

6) The calculated and expected hash values are compared and if they match then the program's
authenticity has been verified.

7) If the hash values do not match then the Host PC is notified of the failure and software may decide 
to put the SoPEC device into powerdown mode. 

8) If the hash values match then the CPU starts executing the downloaded program. 

9) If, as is very likely, the downloaded program wishes to download subsequent programs (such as
OEM code) it is responsible for ensuring the authenticity of everything it downloads. The
downloaded program may contain public keys that are used to authenticate subsequent downloads, thus
forming a hierarchy of authentication. The SoPEC ROM does not control these authentications - it
is solely concerned with verifying that the first program downloaded has come from a trusted
source.

10) At some subsequent point OEM code starts executing. The Silverbrook supervisor code acts as an
O/S to the OEM user mode code. The OEM code must access most SoPEC functionality via system
calls to the Silverbrook code.

11) The OEM code is expected to perform some simple 'turn on the lights' tasks after which the Host
PC is informed that the printer is ready to print and the Start Printing use case comes into play.

Known Weaknesses:

• If the Silverbrook private boot0key is exposed or cracked then the system is seriously compromised. A
ROM mask change would be required to reprogram the boot0key.

10.5.3 Authentication of downloaded code in a multi-SoPEC system

10.5.3.1 ISIMaster SoPEC Process:

1) SoPEC identification by activity on USB end-points 2-4 indicates it is the ISIMaster.

2) The SCB is configured to broadcast the data received from the Host PC.

3) The program is downloaded to the embedded DRAM and broadcast to all ISISlave SoPECs over
the ISI.

4) The CPU calculates a SHA-1 hash digest of the downloaded program.

5) The ResetSrc register in the CPR block is read to determine whether or not a power-on reset
occurred.

6) If a power-on reset occurred the signature of the downloaded code (which needs to be in a known
location such as the first or last N bytes of the downloaded code) is decrypted using the Silverbrook
public boot0key stored in ROM. This decrypted signature is the expected SHA-1 hash of the
accompanying program. The encryption algorithm is likely to be a public key algorithm such as
RSA. If a power-on reset did not occur then the expected SHA-1 hash is retrieved from the PSS and
the compute intensive decryption is not required.

7) The calculated and expected hash values are compared and if they match then the program's
authenticity has been verified.

8) If the hash values do not match then the Host PC is notified of the failure and software may decide
to put the SoPEC device into powerdown mode.

9) If the hash values match then the CPU starts executing the downloaded program.

10) It is likely that the downloaded program will poll each ISISlave SoPEC for the result of its
authentication process and to determine the number of slaves present.

11) If any slave reports a failed authentication then the ISIMaster communicates this to the Host PC and
puts itself into powerdown mode.
12) If all ISISlaves report successful authentication then the downloaded program is responsible for the
downloading, authentication and distribution of subsequent programs within the multi-SoPEC
system.

13) At some subsequent point OEM code starts executing. The Silverbrook supervisor code acts as an
O/S to the OEM user mode code. The OEM code must access most SoPEC functionality via system
calls to the Silverbrook code.

14) The OEM code is expected to perform some simple 'turn on the lights' tasks after which the master
SoPEC determines that all SoPECs are ready to print. The Host PC is informed that the printer is
ready to print and the Start Printing use case comes into play.



10.5.3.2 ISISlave SoPEC Process:

1) When the CPU comes out of reset the SCB should still be in slave mode, and the SCB is already 
configured to receive data from the ISIMaster. 

2) The program is downloaded to embedded DRAM. 

3) The CPU calculates a SHA-1 hash digest of the downloaded program. 

4) The ResetSrc register in the CPR block is read to determine whether or not a power-on reset 
occurred. 

5) If a power-on reset occurred the signature of the downloaded code (which needs to be in a known
location such as the first or last N bytes of the downloaded code) is decrypted using the Silverbrook
public boot0key stored in ROM. This decrypted signature is the expected SHA-1 hash of the
accompanying program. The encryption algorithm is likely to be a public key algorithm such as
RSA. If a power-on reset did not occur then the expected SHA-1 hash is retrieved from the PSS and
the compute intensive decryption is not required.

6) The calculated and expected hash values are compared and if they match then the program's
authenticity has been verified.

7) If the hash values do not match, then the ISISlave device will await a new program again,
eventually timing out and powering down.

8) If the hash values match then the CPU starts executing the downloaded program. 

9) It is likely that the downloaded program will communicate the result of its authentication process to
the ISIMaster. The downloaded program is responsible for determining the SoPEC's ISIId, and for
receiving and authenticating any subsequent programs.

10) At some subsequent point OEM code starts executing. The Silverbrook supervisor code acts as an 
O/S to the OEM user mode code. The OEM code must access most SoPEC functionality via system 
calls to the Silverbrook code. 

11) The OEM code is expected to perform some simple 'turn on the lights' tasks after which the master
SoPEC is informed that this slave is ready to print. The Start Printing use case then comes into play.

Known Weaknesses 

• If the Silverbrook private boot0key is exposed or cracked then the system is seriously compromised.

• ISI is an open interface i.e. messages sent over the ISI are in the clear. The communication channels
are insecure but all traffic is signed to guarantee authenticity. As all communication over the ISI is
controlled by Supervisor code on both the ISIMaster and ISISlave then this also provides some protection
against software attacks. 
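
The hash comparison at the heart of the authentication processes in Sections 10.5.2 and 10.5.3 can be sketched as follows. This is a minimal illustration, not the ROM code: the function name, the dictionary standing in for the PSS, the `expected_hash` key and the `decrypt_signature` callable (which stands in for the public key decryption with the key held in ROM) are all assumptions.

```python
import hashlib
import hmac


def authenticate_program(program, signature, power_on_reset, pss, decrypt_signature):
    """Return True when the downloaded program matches its expected SHA-1 hash.

    pss is a dict modelling Power-Safe Storage (hypothetical layout).
    decrypt_signature is a caller-supplied function that recovers the
    expected hash from the signature (hypothetical interface).
    """
    calculated = hashlib.sha1(program).digest()
    if power_on_reset:
        # Compute intensive path: recover the expected hash from the signature
        # and keep it so that later wakeups can skip the decryption.
        expected = decrypt_signature(signature)
        pss["expected_hash"] = expected
    else:
        # Wakeup path: reuse the result stored in the PSS.
        expected = pss["expected_hash"]
    return hmac.compare_digest(calculated, expected)
```

Passing the decryption step in as a function keeps the sketch independent of the actual public key algorithm, which the text above only describes as likely to be RSA.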
10.5.4 Authentication and upgrade of operating parameters for a printer

The SoPEC IC will be used in a range of printers with different capabilities (e.g. A3/A4 printing, printing
speed, resolution etc.). It is expected that some printers will also have a software upgrade capability which
would allow a user to purchase a license that enables an upgrade in their printer's capabilities (such as
print speed). To facilitate this it must be possible to securely store the operating parameters in the
PRINTER_QA chip, to securely communicate these parameters to the SoPEC and to securely reprogram
the parameters in the event of an upgrade. Note that each printing SoPEC (as opposed to a SoPEC that is
only used for the storage of data) will have its own PRINTER_QA chip (or at least access to a
PRINTER_QA that contains the SoPEC's SoPEC_id_key). Therefore both ISIMaster and ISISlave
SoPECs will need to authenticate operating parameters.

Process:

1) Program code is downloaded and authenticated as described in sections 10.5.2 and 10.5.3 above.

2) The program code has a function to create the SoPEC_id_key from the unique SoPEC_id that was
programmed when the SoPEC was manufactured.

3) The SoPEC retrieves the signed operating parameters from its PRINTER_QA chip. The
PRINTER_QA chip uses the SoPEC_id_key (which is stored as part of the pairing process
executed during printhead assembly manufacture & test) to sign the operating parameters which are
appended with a random number to thwart replay attacks.

4) The SoPEC checks the signature of the operating parameters using its SoPEC_id_key. If this
signature authentication process is successful then the operating parameters are considered valid and the
overall boot process continues. If not the error is reported to the Host PC.

5) Operating parameters may also be set or upgraded using a second key, the PrintEngineLicense_key,
which is stored on the PRINTER_QA and used to authenticate the change in operating parameter.

Known Weaknesses:

• It may be possible to retrieve the unique SoPEC_id by placing the SoPEC in test mode and scanning it
out. It is certainly possible to obtain it by reverse engineering the device. Either way the SoPEC_id
(and by extension the SoPEC_id_key) so obtained is valid only for that specific SoPEC and so printers
may only be compromised one at a time by parties with the appropriate specialised equipment.
Furthermore even if the SoPEC_id is compromised, the other keys in the system, which protect the
authentication of consumables and of program code, are unaffected.

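The signing and checking of operating parameters with a shared SoPEC_id_key and a random number can be sketched as follows. This is a hedged illustration only: the use of HMAC-SHA1 as the signing primitive, the assumption that the SoPEC supplies the random number, and the function names are not taken from this specification.

```python
import hashlib
import hmac


def sign_parameters(sopec_id_key, parameters, random_number):
    """PRINTER_QA side: sign the parameters with the random number appended."""
    return hmac.new(sopec_id_key, parameters + random_number, hashlib.sha1).digest()


def verify_parameters(sopec_id_key, parameters, random_number, signature, expected_random):
    """SoPEC side: accept the parameters only if the signature and random number match."""
    if random_number != expected_random:
        # A replayed response carries a stale random number.
        return False
    recomputed = sign_parameters(sopec_id_key, parameters, random_number)
    return hmac.compare_digest(recomputed, signature)
```

Because the random number changes on every exchange, a previously recorded signed response does not verify against a fresh request, which is the replay protection described in step 3.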


10.6 Miscellaneous Use Cases 



There are many miscellaneous use cases such as the following examples. Software running on the SoPEC 
CPU or Host will decide on what actions to take in these scenarios. 

10.6.1 Disconnect / Re-connect of QA chips.

1) Disconnect of a QA chip between documents or if ink runs out mid-document.

2) Re-connect of a QA chip once authenticated e.g. ink cartridge replacement should allow the system
to resume and print the next document.

10.6.2 Page arrives before print ready interrupt. 

1) Engage clutch to stop paper until print ready interrupt occurs.

10.6.3 Dead-nozzle table upgrade

This sequence is typically performed when dead nozzle information needs to be updated by performing a 
printhead dead nozzle test. 

1) Run printhead nozzle test sequence.

2) Either Host or SoPEC CPU converts dead nozzle information into dead nozzle table. 

3) Store dead nozzle table on Host. 

4) Write dead nozzle table to SoPEC DRAM. 



Doc: SoPEC_hardware_design
Version: 2.3
S3 Proprietary Document
29 Nov 2002




10.7 Failure Mode Use Cases

10.7.1 System errors and security violations



System errors and security violations are reported to the SoPEC CPU and Host. Software running on the
SoPEC CPU or Host will then decide what actions to take. 

Silverbrook code authentication failure. 

1) Notify Host PC of authentication failure. 

2) Abort print run. 

OEM code authentication failure.

1) Notify Host PC of authentication failure.

2) Abort print run.

Invalid QA chip(s). 

1) Report to Host PC. 

2) Abort print run. 

MMU security violation interrupt.

1) This is handled by exception handler.

2) Report to Host PC.

3) Abort print run. 

Invalid address interrupt from PCU.

1) This is handled by exception handler. 

2) Report to Host PC. 

3) Abort print run. 

Watchdog timer interrupt. 

1) This is handled by exception handler. 

2) Report to Host PC. 

3) Abort print run. 

Host PC does not acknowledge message that SoPEC is about to power down.
1) Power down anyway. 
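
Software that implements the responses above might hold them in a lookup table. The sketch below is one possible arrangement; the event names, action names and the function `actions_for` are hypothetical labels chosen for illustration.

```python
# Ordered actions for each system error or security violation listed above.
# All keys and action names are hypothetical.
HANDLED = ("exception_handler", "report_to_host", "abort_print_run")
REPORTED = ("report_to_host", "abort_print_run")

SYSTEM_ERROR_ACTIONS = {
    "silverbrook_code_auth_failure": REPORTED,
    "oem_code_auth_failure": REPORTED,
    "invalid_qa_chip": REPORTED,
    "mmu_security_violation": HANDLED,
    "pcu_invalid_address": HANDLED,
    "watchdog_timer": HANDLED,
    "host_no_powerdown_ack": ("power_down",),
}


def actions_for(event):
    """Return the ordered actions for a system error event."""
    return SYSTEM_ERROR_ACTIONS[event]
```

Keeping the policy in data rather than in branching code suits the statement that software on the SoPEC CPU or Host decides what action to take.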



10.7.2 Printing errors

Printing errors are reported to the SoPEC CPU and Host. Software running on the Host or SoPEC CPU
will then decide what actions to take.

Insufficient space available in SoPEC compressed band-store to download a band.

1) Report to the Host PC.

Insufficient ink to print.

1) Report to Host PC.

Page not downloaded in time while printing.

1) Buffer underrun interrupt will occur.

2) Report to Host PC and abort print run.

JPEG decoder error interrupt.

1) Report to Host PC.
11 Central Processing Unit (CPU)

11.1 Overview



The CPU block consists of the CPU core, MMU, cache and associated logic. The principal tasks for the
program running on the CPU to fulfill in the system are:

Communications:

• Control the flow of data from the USB interface to the DRAM and ISI

• Communication with the host via USB or ISI

• Running the USB device driver

PEP Subsystem Control:

• Page and band header processing (may possibly be performed on host PC)

• Configure printing options on a per band, per page, per job or per power cycle basis

• Initiate page printing operation in the PEP subsystem

• Retrieve dead nozzle information from the printhead interface (PHI) and forward to the host PC

• Select the appropriate firing pulse profile from a set of predefined profiles based on the printhead
characteristics

• Retrieve printhead temperature via the PHI

Security:

• Authenticate downloaded program code and printer operating parameters

• Authenticate consumables via the PRINTER_QA and INK_QA chips

• Monitor ink usage

• Isolation of OEM code from direct access to the system resources

Other:

• Drive the printer motors using the GPIO pins

• Monitoring the status of the printer (paper jam, tray empty etc.)

• Driving front panel LEDs

• Perform post-boot initialisation of the SoPEC device

• Memory management (likely to be in conjunction with the host PC)

• Miscellaneous housekeeping tasks

To control the Print Engine Pipeline the CPU is required to provide a level of performance at least
equivalent to a 16-bit Hitachi H8-3664 microcontroller running at 16 MHz. An as yet undetermined amount of
additional CPU performance is needed to perform the other tasks. The extra performance required is
dominated by the signature verification task and the SCB (including the USB) management task. An operating
system is not required at present. A number of CPU cores have been evaluated and the LEON P1754 is
considered to be the most appropriate solution. A diagram of the CPU block is shown in Figure 15 below.
[Figure 15 shows the AHB Controller, AHB Interface, LEON Core, Cache & MMU, Address Decoder and
Realtime Debug Unit, together with the CPU subsystem signals listed in Table 14.]

Figure 15. CPU block diagram
11.2 Definitions of I/Os

Table 14. CPU Subsystem I/Os

Port name | Pins | I/O | Description

Clocks and Resets
prst_n | 1 | In | Global reset. Synchronous to pclk, active low.
pclk | 1 | In | Global clock.

CPU to DIU DRAM interface
cpu_adr[21:0] | 22 | Out | Address bus for both DRAM and peripheral access.
cpu_dataout[31:0] | 32 | Out | Data out to both DRAM and peripheral devices. This should be driven at the same time as the cpu_adr and request signals.
dram_cpu_data[255:0] | 256 | In | Read data from the DRAM.
cpu_diu_rreq | 1 | Out | Read request to the DIU DRAM.
diu_cpu_rack | 1 | In | Acknowledge from DIU that read request has been accepted.
diu_cpu_rvalid | 1 | In | Signal from DIU telling SoPEC Unit that valid read data is on the dram_cpu_data bus.
cpu_diu_wreq | 1 | Out | Write request to the DIU.
diu_cpu_wack | 1 | In | Acknowledge from the DIU that the write request has been accepted.
cpu_diu_wvalid | 1 | Out | Signal from the CPU to the DIU indicating that the data currently on the cpu_dataout bus is valid.
cpu_diu_wmask[1:0] | 2 | Out | Flag indicating format of CPU write to DRAM. cpu_diu_wmask = 00: 8-bit write; 01: 16-bit write; 10: 32-bit write; 11: reserved. cpu_adr[2:0] are driven in accordance with the width of the data access indicated by cpu_diu_wmask. Addresses cannot cross a 256-bit word DRAM boundary.

CPU to peripheral blocks
cpu_rwn | 1 | Out | Common read/not-write signal from the CPU.
cpu_acode[1:0] | 2 | Out | CPU access code signals. cpu_acode[0] - Program (0) / Data (1) access. cpu_acode[1] - User (0) / Supervisor (1) access.
cpu_cpr_sel | 1 | Out | CPR block select.
cpr_cpu_rdy | 1 | In | Ready signal to the CPU. When cpr_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the CPR block and for a read cycle this means the data on cpr_cpu_data is valid.
cpr_cpu_berr | 1 | In | CPR bus error signal to the CPU.
cpr_cpu_data[31:0] | 32 | In | Read data bus from the CPR block.
cpu_gpio_sel | 1 | Out | GPIO block select.
gpio_cpu_rdy | 1 | In | GPIO ready signal to the CPU.
gpio_cpu_berr | 1 | In | GPIO bus error signal to the CPU.
gpio_cpu_data[31:0] | 32 | In | Read data bus from the GPIO block.
cpu_icu_sel | 1 | Out | ICU block select.
icu_cpu_rdy | 1 | In | ICU ready signal to the CPU.
icu_cpu_berr | 1 | In | ICU bus error signal to the CPU.
icu_cpu_data[31:0] | 32 | In | Read data bus from the ICU block.
cpu_lss_sel | 1 | Out | LSS block select.
lss_cpu_rdy | 1 | In | LSS ready signal to the CPU.
lss_cpu_berr | 1 | In | LSS bus error signal to the CPU.
lss_cpu_data[31:0] | 32 | In | Read data bus from the LSS block.
cpu_pcu_sel | 1 | Out | PCU block select.
pcu_cpu_rdy | 1 | In | PCU ready signal to the CPU.
pcu_cpu_berr | 1 | In | PCU bus error signal to the CPU.
pcu_cpu_data[31:0] | 32 | In | Read data bus from the PCU block.
cpu_scb_sel | 1 | Out | SCB block select.
scb_cpu_rdy | 1 | In | SCB ready signal to the CPU.
scb_cpu_berr | 1 | In | SCB bus error signal to the CPU.
scb_cpu_data[31:0] | 32 | In | Read data bus from the SCB block.
cpu_tim_sel | 1 | Out | Timers block select.
tim_cpu_rdy | 1 | In | Timers block ready signal to the CPU.
tim_cpu_berr | 1 | In | Timers bus error signal to the CPU.
tim_cpu_data[31:0] | 32 | In | Read data bus from the Timers block.
cpu_rom_sel | 1 | Out | ROM block select.
rom_cpu_rdy | 1 | In | ROM block ready signal to the CPU.
rom_cpu_berr | 1 | In | ROM bus error signal to the CPU.
rom_cpu_data[31:0] | 32 | In | Read data bus from the ROM block.
cpu_pss_sel | 1 | Out | PSS block select.
pss_cpu_rdy | 1 | In | PSS block ready signal to the CPU.
pss_cpu_berr | 1 | In | PSS bus error signal to the CPU.
pss_cpu_data[31:0] | 32 | In | Read data bus from the PSS block.
cpu_diu_sel | 1 | Out | DIU register block select.
diu_cpu_rdy | 1 | In | DIU register block ready signal to the CPU.
diu_cpu_berr | 1 | In | DIU bus error signal to the CPU.
diu_cpu_data[31:0] | 32 | In | Read data bus from the DIU block.

Interrupt signals
icu_cpu_ilevel[3:0] | 3 | In | An interrupt is asserted by driving the appropriate priority level on icu_cpu_ilevel. These signals must remain asserted until the CPU executes an interrupt acknowledge cycle.
cpu_icu_ilevel[3:0] | 3 | Out | Indicates the level of the interrupt the CPU is acknowledging when cpu_iack is high.
cpu_iack | 1 | Out | Interrupt acknowledge signal. The exact timing depends on the CPU core implementation.

Debug signals
diu_cpu_debug_valid | 1 | In | Signal indicating the data on the diu_cpu_data bus is valid debug data.
tim_cpu_debug_valid | 1 | In | Signal indicating the data on the tim_cpu_data bus is valid debug data.
scb_cpu_debug_valid | 1 | In | Signal indicating the data on the scb_cpu_data bus is valid debug data.
pcu_cpu_debug_valid | 1 | In | Signal indicating the data on the pcu_cpu_data bus is valid debug data.
lss_cpu_debug_valid | 1 | In | Signal indicating the data on the lss_cpu_data bus is valid debug data.
icu_cpu_debug_valid | 1 | In | Signal indicating the data on the icu_cpu_data bus is valid debug data.
gpio_cpu_debug_valid | 1 | In | Signal indicating the data on the gpio_cpu_data bus is valid debug data.
cpr_cpu_debug_valid | 1 | In | Signal indicating the data on the cpr_cpu_data bus is valid debug data.
debug_data_out | 18 | Out | Output debug data to be muxed on to the PHI pins.
debug_data_valid | 1 | Out | Debug valid signal indicating the validity of the data on debug_data_out. This signal is used in all debug configurations.
debug_cntrl | 20 | Out | Control signal for each PHI bound debug data line indicating whether or not the debug data should be selected by the pin mux.

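The cpu_diu_wmask and cpu_acode encodings from Table 14 can be illustrated with a short sketch. The function names are hypothetical, and treating the address as a byte address within a 32-byte (256-bit) DRAM word is an assumption made so that the boundary rule can be shown.

```python
DRAM_WORD_BYTES = 32  # one 256-bit DRAM word

# cpu_diu_wmask encodings from Table 14 (binary 11 is reserved).
WMASK_BY_WIDTH = {8: 0b00, 16: 0b01, 32: 0b10}


def encode_wmask(byte_address, width_bits):
    """Return the cpu_diu_wmask value for a CPU write of the given width."""
    if width_bits not in WMASK_BY_WIDTH:
        raise ValueError("write width must be 8, 16 or 32 bits")
    offset = byte_address % DRAM_WORD_BYTES
    if offset + width_bits // 8 > DRAM_WORD_BYTES:
        raise ValueError("write would cross a 256-bit DRAM word boundary")
    return WMASK_BY_WIDTH[width_bits]


def decode_acode(acode):
    """Split cpu_acode[1:0] into (access type, privilege level)."""
    access = "data" if acode & 0b01 else "program"
    privilege = "supervisor" if acode & 0b10 else "user"
    return access, privilege
```

For example, under these assumptions a 32-bit write at byte offset 28 of a DRAM word is accepted, while the same write at offset 30 is rejected because it would span two DRAM words.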

11.3 Realtime requirements 



The SoPEC realtime requirements have yet to be fully determined but they may be split into three
categories: hard, firm and soft.



11.3.1 Hard realtime requirements 

Hard requirements are tasks that must be completed before a certain deadline or failure to do so will result 
in an error perceptible to the user (printing stops or functions incorrectly). There are three hard realtime 
tasks: 

• Motor control: The motors which feed the paper through the printer at a constant speed during
printing are driven directly by the SoPEC device. Four periodic signals with different phase
relationships need to be generated to ensure the paper travels smoothly through the printer. The
generation of these signals is handled by the GPIO hardware (see section 13.2 for more details) but the
CPU is responsible for enabling these signals (i.e. to start or stop the motors) and coordinating the
movement of the paper with the printing operation of the printhead.

• Buffer management: Data enters the SoPEC via the SCB at an uneven rate and is consumed by the
PEP subsystem at a different rate. The CPU is responsible for managing the DRAM buffers to
ensure that neither overrun nor underrun occur. This buffer management is likely to be performed
under the direction of the host.

• Band processing: In certain cases PEP registers may need to be updated between bands. As the
timing requirements are most likely too stringent to be met by direct CPU writes to the PCU a more
likely scenario is that a set of shadow registers will be programmed in the compressed page units
before the current band is finished, copied to band related registers by the finished band signals and
the processing of the next band will continue immediately. An alternative solution is that the CPU
will construct a DRAM based set of commands (see section 21.8.5 for more details) that can be
executed by the PCU. The task for the CPU here is to parse the band headers stored in DRAM and
generate a DRAM based set of commands for the next number of bands. The location of the DRAM
based set of commands must then be written to the PCU before the current band has been processed
by the PEP subsystem. It is also conceivable (but currently considered unlikely) that the host PC
could create the DRAM based commands. In this case the CPU will only be required to point the
PCU to the correct location in DRAM to execute commands from.



Doc: SoPEC_hardware_design, Version: 2.3, 29 Nov 2002, S3 Proprietary Document



11.3.2 Firm requirements 

Firm requirements are tasks that should be completed by a certain time or failure to do so will result in a
degradation of performance but not an error. The majority of the CPU tasks for SoPEC fall into this
category including all interactions with the QA chips, program authentication, page feeding, configuring PEP
registers for a page or job, determining the firing pulse profile, communication of printer status to the host
over the USB and the monitoring of ink usage. The authentication of downloaded programs and messages
will be the most compute intensive operation the CPU will be required to perform. Initial investigations
indicate that the LEON processor, running at 160 MHz, will easily perform three authentications in under
a second.



Table 15. Expected firm requirements

Requirement | Time
Power-on to start of printing first page (USB and slave SoPEC enumeration, 3 or more RSA signature verifications, code and compressed page data download and chip initialisation) | ~ 8 secs ??
Wake-up from sleep mode to start printing (3 or more SHA-1 operations, code and compressed page data download and chip re-initialisation) | ~ 2 secs
Authenticate ink usage in the printer | ~ 0.5 secs
Determining firing pulse profile | ~ 0.1 secs
Page feeding, gap between pages | OEM dependent
Communication of printer status to host PC | ~ 10 ms
Configuring PEP registers | ??



11.3.3 Soft requirements 

Soft requirements are tasks that need to be done but there are only light time constraints on when they need
to be done. These tasks are performed by the CPU when there are no pending higher priority tasks. As the
SoPEC CPU is expected to be lightly loaded these tasks will mostly be executed soon after they are
scheduled.

11.4 Bus Protocols 

As can be seen from Figure 15 above there are different buses in the CPU block and different protocols are
used for each bus. There are three buses in operation:

11.4.1 CPU core to cache/MMU bus 

This is the native bus of the CPU core. See section 11.6.6.1 for more details. Timing and full signal details
should be provided in the documentation accompanying this core.

11.4.2 Cache/MMU to DIU bus

This bus conforms to the DIU bus protocol described in Section 20.13.2. Note that the address and data
buses are shared with the peripheral bus. The effective bus width differs between a read (256 bits) and a
write (32/16/8 bits) and only the bottom 32 bits of the bus are shared with the peripheral bus. As certain
CPU instructions may require byte write access this will need to be supported in the DIU. See section
11.6.6.2 for more details.

11.4.3 CPU Subsystem Bus

For access to the on-chip peripherals a simple bus protocol is used. The MMU must first determine which
particular block is being addressed (and that the access is a valid one) so that the appropriate block select
signal can be generated. During a write access CPU write data is driven out with the address and block
select signals in the first cycle of an access. The addressed slave peripheral responds by asserting its ready
signal indicating that it has registered the write data and the access can complete. The write data bus is
common to all peripherals and is also used for CPU writes to the embedded DRAM. A read access is
initiated by driving the address and select signals during the first cycle of an access. The addressed slave
responds by placing the read data on its bus and asserting its ready signal to indicate to the CPU that the
read data is valid. Each block has a separate point-to-point data bus for read accesses to avoid the need for
a tri-stateable bus.

All peripheral accesses are 32-bit. Support for byte or 16-bit accesses may be added if required by an
imported IP block such as the USB controller. The use of the ready signal allows the accesses to be of
variable length. In most cases accesses will complete in two cycles but three or four (or more) cycle accesses
are likely for PEP blocks or IP blocks with a different native bus interface. All PEP blocks are accessed via
the PCU which acts as a bridge. The PCU bus uses a similar protocol to the CPU subsystem bus but with
the PCU as the bus master.

The duration of accesses to the PEP blocks is influenced by whether or not the PCU is executing
commands from DRAM. As these commands are essentially register writes the CPU access will need to wait
until the PCU bus becomes available when a register access has been completed. This could lead to the
CPU being stalled for up to 4 cycles if it attempts to access PEP blocks while the PCU is executing a
command. The size and probability of this penalty are too small to have any significant impact on
performance.

In order to support user mode (i.e. OEM code) access to certain peripherals the CPU subsystem bus
propagates the CPU function code signals (cpu_acode[1:0]). These signals indicate the type of address space
(i.e. User/Supervisor and Program/Data) being accessed by the CPU for each access. Each peripheral must
determine whether or not the CPU is in the correct mode to be granted access to its registers and in some
cases (e.g. Timers and GPIO blocks) different access permissions can apply to different registers within
the block. If the CPU is not in the correct mode then the violation is flagged by asserting the block's bus
error signal (block_cpu_berr) with the same timing as its ready signal (block_cpu_rdy) which remains
deasserted. When this occurs invalid read accesses should return 0 and write accesses should have no
effect.

Figure 16 shows two examples of the peripheral bus protocol in action. A write to the LSS block from
code running in supervisor mode is successfully completed. This is immediately followed by a read from a
PEP block via the PCU from code running in user mode. As this type of access is not permitted the access
is terminated with a bus error. The bus error exception processing then starts directly after this - no further
accesses to the peripheral should be required as the exception handler should be located in the DRAM.

Each peripheral acts as a slave on the CPU subsystem bus and its behavior is described by the state
machine in section 11.4.3.1.

[Figure 16 timing diagram. Signals shown: pclk, cpu_adr[21:0], cpu_rwn, cpu_acode[1:0], cpu_lss_sel,
lss_cpu_rdy, lss_cpu_berr, cpu_pcu_sel, pcu_cpu_berr, pcu_cpu_rdy, pcu_cpu_data[31:0]. The address bus
carries an LSS address followed by a PEP address, cpu_acode carries Supervisor Data followed by User
Data, and pcu_cpu_data returns 0x0000_0000 for the rejected read.]

Figure 16. CPU bus transactions



11.4.3.1 CPU subsystem bus slave state machine

CPU subsystem bus slave operation is described by the state machine in Figure 17. This state machine
will be implemented in each CPU subsystem bus slave. The only new signals mentioned here are the
valid_access and reg_available signals. The valid_access signal is determined by comparing the cpu_acode
value with the block or register (in the case of a block that allows user access on a per register basis such as
the GPIO block) access permissions and asserting valid_access if the permissions agree with the CPU
mode. The reg_available signal is only required in the PCU or in blocks that are not capable of two-cycle
access (e.g. blocks containing imported IP with different bus protocols). In these blocks the reg_available
signal is an internal signal used to insert wait states (by delaying the assertion of block_cpu_rdy) until the
CPU bus slave interface can gain access to the register.

When reading from a register that is less than 32 bits wide the CPU subsystem bus slave should return
zeroes on the unused upper bits of the block_cpu_data bus.

To support debug mode the contents of the register selected for debug observation, debug_reg, are always
output on the block_cpu_data bus whenever a read access is not taking place. See section 11.8 for more
details of debug operation.

[Figure 17 state diagram. Legible content: while idle or in reset (prst_n = 0) the slave drives
block_cpu_rdy = 0, block_cpu_berr = 0 and block_cpu_data = debug_reg_data with block_cpu_debug_valid = 1.
A selected read with valid_access and reg_available asserted leads to the Read Access Complete state
(block_cpu_rdy = 1, block_cpu_berr = 0, block_cpu_data = reg_data). A selected read without valid_access
leads to the Invalid Read Access state (block_cpu_berr = 1, block_cpu_data = 0x00000000). A selected write
with valid_access and reg_available asserted completes with block_cpu_rdy = 1, block_cpu_berr = 0 and
reg_data = cpu_dataout. A selected write without valid_access ends with block_cpu_rdy = 0 and
block_cpu_berr = 1.]

Figure 17. State machine for a CPU subsystem slave
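The slave behaviour described in this section can be sketched as a small behavioural model. This is an illustration only, not RTL from the SoPEC design: the function name slave_cycle, the SlaveOutputs container and the dictionary used as a register file are hypothetical, and the outputs driven during a wait state are an assumption (the text only says that block_cpu_rdy is delayed).

```python
from dataclasses import dataclass


@dataclass
class SlaveOutputs:
    rdy: int   # block_cpu_rdy
    berr: int  # block_cpu_berr
    data: int  # block_cpu_data


def slave_cycle(regs, debug_reg, sel, rwn, valid_access, reg_available,
                addr=0, wdata=0):
    """Evaluate one access cycle of a hypothetical CPU subsystem bus slave.

    regs is a dict of register offset -> value standing in for the block's
    registers; debug_reg is the offset selected for debug observation.
    """
    # Whenever no read is completing, the debug register is driven out.
    idle = SlaveOutputs(rdy=0, berr=0, data=regs.get(debug_reg, 0))
    if not sel:
        return idle
    if not valid_access:
        # Violation: berr asserted, rdy stays low, reads return 0 and
        # writes have no effect.
        return SlaveOutputs(rdy=0, berr=1, data=0)
    if not reg_available:
        # Wait state (assumption: outputs look like the idle outputs).
        return idle
    if rwn:
        return SlaveOutputs(rdy=1, berr=0, data=regs.get(addr, 0) & 0xFFFFFFFF)
    regs[addr] = wdata & 0xFFFFFFFF
    return SlaveOutputs(rdy=1, berr=0, data=regs.get(debug_reg, 0))
```

Calling slave_cycle with reg_available = 0 on successive cycles models the three or four cycle accesses expected for PEP blocks; the access completes on the first call in which reg_available is 1.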



11.5 LEON CPU 



The LEON processor is an open-source implementation of the IEEE-1754 standard (SPARC V8)
instruction set. LEON is available from and actively supported by Gaisler Research (www.gaisler.com).

The following features of the LEON-2 processor will be utilised on SoPEC: 

• IEEE-1754 (SPARC V8) compatible integer unit with 5-stage pipeline

• Separate instruction and data cache (Harvard architecture)

• Set-associative caches: 1-4 sets, 1-64 kbyte/set. Random, LRR or LRU replacement. Direct
mapped caches are also available and are the more likely option for SoPEC.

• Full implementation of AMBA-2.0 AHB on-chip bus 

• Power-down mode 

The standard release of LEON incorporates a number of peripherals and support blocks which will not be
included on SoPEC. The LEON core as used on SoPEC will consist of: 1) the LEON integer unit,
2) possibly the instruction and data caches (currently under review), 3) the cache control logic (to be
significantly reduced by optimisation if the caches are not used), 4) the AHB interface and 5) possibly the AHB
controller (although this functionality may be implemented in the LEON Bridge).

The version of the LEON database that the SoPEC LEON components will be sourced from is
LEON2-1.0.8 although later versions may be used if they offer worthwhile functionality or bug fixes that affect the
SoPEC design. Note that if the LEON caches are not used then we may revert to v1.0.7 of the database as
the cache control logic is likely to be simpler and easier to optimise away (v1.0.8 introduced support for
set-associative caching).

The LEON core will be clocked using the system clock, pclk, and reset using the prst_n_section[1] signal.
The ICU will assert all the hardware interrupts using the protocol described in section 11.9. The particular
types of SRAMs (for LEON caches) and register files used will be determined during the implementation
phase. The LEON hardware multipliers are not expected to be required. Furthermore it is anticipated that
SoPEC will use the recommended 8 register window configuration.

Further details of the SPARC V8 instruction set and the LEON processor can be found in [32] and [33] 
respectively. 



11.6 Memory Management Unit (MMU)

Memory Management Units are typically used to protect certain regions of memory from invalid accesses,
to perform address translation for a virtual memory system and to maintain memory page status
(swapped-in, swapped-out or unmapped).

The SoPEC MMU is a much simpler affair whose function is to ensure that all regions of the SoPEC
memory map are adequately protected. The MMU does not support virtual memory and physical addresses are
used at all times - the one exception to this is the address translation of the reset vector. The SoPEC MMU
supports a full 32-bit address space. A proposed memory map is shown in Figure 18 below.

The MMU selects the relevant bus protocol and generates the appropriate control signals depending on the
area of memory being accessed. The MMU is responsible for performing the address decode and
generation of the appropriate block select signal as well as the selection of the correct block read bus during a
read access. The MMU will need to support all of the bus transactions the CPU can produce including
interrupt acknowledge cycles, aborted transactions etc.

When an MMU error occurs (such as an attempt to access a supervisor mode only region when in user
mode) a bus error is generated. While the LEON can recognise different types of bus error (e.g. data store
error, instruction access error) it appears to handle them in the same manner as it handles all traps i.e. it will
transfer control to a trap handler. No extra state information appears to be stored because of the nature of
the trap. The location of the trap handler is contained in the TBR (Trap Base Register). This is the same
mechanism as is used to handle interrupts. Further investigation is needed to determine exactly how LEON
behaves when a bus error type trap occurs to determine the best approach to handling bus errors. It may be
simplest to just treat them as the highest priority interrupt.

[Figure 18 memory map, from top to bottom: unused space from 0x002A_C000 up to 0xFFFF_FFFF (accesses
in this area are not allowed and result in a bus error exception); PCU Mapped Registers from 0x002A_0000;
Peripheral Registers from 0x0029_0000; ROM from 0x0028_0000; DRAM, divided into DRAM regions, from
0x0000_0000. Accesses in the register and ROM areas are via the CPU bus and are controlled by
permissions set in each peripheral. Accesses in the DRAM area are via the DIU bus and are controlled by
permissions set in the MMU.]

Figure 18. Proposed SoPEC CPU memory map (not to scale)

11.6.1 CPU-bus peripherals address map

The address mapping for the peripherals attached to the CPU-bus is shown in Table 16 below. The MMU
performs the decode of the high order bits to generate the relevant cpu_block_select signal. Apart from the
PCU, which decodes the address space for the PEP blocks, each block only needs to decode as many bits
of cpu_adr[11:2] as required to address all the registers within the block.



Table 16. CPU-bus peripherals address map

Block_base | Address
MMU_base | 0x0029_0000
TIM_base | 0x0029_1000
LSS_base | 0x0029_2000
GPIO_base | 0x0029_3000
SCB_base | 0x0029_4000
ICU_base | 0x0029_5000
CPR_base | 0x0029_6000
ROM_base | 0x0029_7000
DIU_base | 0x0029_8000
PSS_base | 0x0029_9000
Reserved | 0x0029_A000 to 0x0029_FFFF
PCU_base | 0x002A_0000
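As an illustration of the high order decode performed by the MMU, the sketch below maps a CPU address to a block name and a register offset using the base addresses of Table 16. The 4 KB window per block is inferred from the spacing of the base addresses, the PCU upper bound of 0x002A_C000 is taken from Figure 18, and the function name decode_block is hypothetical.

```python
# Base addresses from Table 16.
PERIPHERAL_BASES = {
    "MMU": 0x0029_0000,
    "TIM": 0x0029_1000,
    "LSS": 0x0029_2000,
    "GPIO": 0x0029_3000,
    "SCB": 0x0029_4000,
    "ICU": 0x0029_5000,
    "CPR": 0x0029_6000,
    "ROM": 0x0029_7000,
    "DIU": 0x0029_8000,
    "PSS": 0x0029_9000,
}
BLOCK_WINDOW = 0x1000        # assumption: one 4 KB window per block
PCU_BASE = 0x002A_0000
PCU_LIMIT = 0x002A_C000      # start of the unused space in Figure 18


def decode_block(addr):
    """Return (block name, byte offset within the block), or None if no
    CPU-bus block is selected by this address."""
    if PCU_BASE <= addr < PCU_LIMIT:
        # The PCU decodes the PEP block address space itself.
        return ("PCU", addr - PCU_BASE)
    for name, base in PERIPHERAL_BASES.items():
        if base <= addr < base + BLOCK_WINDOW:
            # The block only sees the low order bits, cpu_adr[11:2].
            return (name, addr - base)
    return None
```

For example, decode_block(0x0029_3004) selects the GPIO block with offset 4, while an address in the reserved range 0x0029_A000 to 0x0029_FFFF selects nothing.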



11.6.2 DRAM Region Mapping 

The embedded DRAM is broken into 8 regions, with each region defined by a lower and upper bound
address and with its own access permissions.

The association of an area in the DRAM address space with a MMU region is completely under software
control. Table 17 below gives one possible region mapping. Regions should be defined according to their
access requirements and position in memory. Regions that share the same access requirements and that are
contiguous in memory may be combined into a single region. The example below is purely for indicative
purposes - real mappings are likely to differ significantly from this. Note that the RegionBottom and
RegionTop fields in this example are byte aligned and would need to be right-shifted by 5 places to obtain the
256-bit aligned value used to program the RegionNTop and RegionNBottom registers. For more details see
11.6.5.1 and 11.6.5.2.



Table 17. Example region mapping

Region | RegionBottom | RegionTop | Description
0 | 0x0000_0000 | 0x0000_0FFF | Silverbrook OS (supervisor) data
1 | 0x0000_1000 | 0x0000_BFFF | Silverbrook OS (supervisor) code
2 | 0x0000_C000 | 0x0000_C3FF | Silverbrook (supervisor/user) data
3 | 0x0000_C400 | 0x0000_CFFF | Silverbrook (supervisor/user) code
4 | 0x0026_D000 | 0x0026_D3FF | OEM (user) data
5 | 0x0026_D400 | 0x0026_DFFF | OEM (user) code
6 | 0x0027_E000 | 0x0027_FFFF | Shared Silverbrook/OEM space
7 | 0x0000_D000 | 0x0026_CFFF | Compressed page store (supervisor data)
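The right shift mentioned before Table 17 can be sketched as follows. The helper region_registers is hypothetical; it assumes that a region's bottom byte address is the first byte of a 256-bit word and that its top byte address is the last byte of a 256-bit word, and it rejects anything else.

```python
WORD_BYTES = 32  # one 256-bit DRAM word


def region_registers(bottom_byte, top_byte):
    """Convert inclusive byte bounds into (RegionNBottom, RegionNTop)
    register values by right-shifting by 5 places."""
    if bottom_byte % WORD_BYTES != 0:
        raise ValueError("bottom is not aligned to a 256-bit word")
    if (top_byte + 1) % WORD_BYTES != 0:
        raise ValueError("top is not the last byte of a 256-bit word")
    return bottom_byte >> 5, top_byte >> 5


# Region 1 of the example mapping (Silverbrook OS code).
bottom_reg, top_reg = region_registers(0x0000_1000, 0x0000_BFFF)
print(hex(bottom_reg), hex(top_reg))
```

Because both bounds are inclusive, the region above spans top_reg - bottom_reg + 1 DRAM words.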



11.6.3 Non-DRAM regions 

As shown in Figure 18 the DRAM occupies only 2.5 MBytes of the total 4 GB SoPEC address space. The
non-DRAM regions of SoPEC are handled by the MMU as follows:

ROM (0x0028_0000 to 0x0028_FFFF): The ROM block will control the access types allowed. The
cpu_acode[1:0] signals will indicate the CPU mode and access type and the ROM block will assert
rom_cpu_berr if an attempted access is forbidden. The protocol is described in more detail in section
11.4.3. The ROM block access permissions are hard wired to allow all read accesses except to the
FuseChipID registers which may only be read in supervisor mode.

MMU Internal Registers (0x0029_0000 to 0x0029_0FFF): The MMU is responsible for controlling the
accesses to its own internal registers and will only allow data reads and writes (no instruction fetches)
from supervisor data space. All other accesses will result in the mmu_cpu_berr signal being asserted in
accordance with the CPU native bus protocol.

CPU Subsystem Peripheral Registers (0x0029_1000 to 0x0029_FFFF): Each peripheral block will
control the access types allowed. Every peripheral will allow supervisor data accesses (both read and
write) and some blocks (e.g. Timers and GPIO) will also allow user data space accesses as outlined in the
relevant chapters of this specification. Neither supervisor nor user instruction fetch accesses are allowed to
any block as it is not possible to execute code from peripheral registers. The bus protocol is described in
section 11.4.3.

PCU Mapped Registers (0x002A_0000 to 0x002A_BFFF): All of the PEP block registers which are
accessed by the CPU via the PCU will inherit the access permissions of the PCU. These access
permissions are hard wired to allow supervisor data accesses only and the protocol used is the same as for the
CPU peripherals.

Unused address space (0x002A_C000 to 0xFFFF_FFFF): All accesses to the unused portion of the
address space will result in the mmu_cpu_berr signal being asserted in accordance with the CPU native
bus protocol. These accesses will not propagate outside of the MMU i.e. no external access will be
initiated.
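The handling of the address space described above can be summarised in a small lookup. This is a sketch only: the names MEMORY_MAP and classify are hypothetical, and the DRAM upper bound of 0x0027_FFFF is derived from the 2.5 MByte DRAM size rather than stated explicitly in the text.

```python
# (first address, last address, area, block that checks the permissions)
MEMORY_MAP = [
    (0x0000_0000, 0x0027_FFFF, "DRAM", "MMU region registers"),
    (0x0028_0000, 0x0028_FFFF, "ROM", "ROM block"),
    (0x0029_0000, 0x0029_0FFF, "MMU internal registers", "MMU"),
    (0x0029_1000, 0x0029_FFFF, "Peripheral registers", "addressed peripheral"),
    (0x002A_0000, 0x002A_BFFF, "PCU mapped registers", "PCU"),
]


def classify(addr):
    """Return (area, permission checker) for a 32-bit CPU address."""
    for first, last, area, checker in MEMORY_MAP:
        if first <= addr <= last:
            return area, checker
    # Unused space: the MMU asserts mmu_cpu_berr and nothing is
    # propagated outside of the MMU.
    return "Unused", "MMU (always a bus error)"
```

A lookup such as classify(0x0028_0040) returns the ROM entry, and any address from 0x002A_C000 upwards falls through to the unused case.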

11.6.4 Reset exception vector and reference zero traps

When a reset occurs the LEON processor starts executing code from address 0x0000_0000. On SoPEC the
embedded DRAM occupies this area of the address map. As the DRAM contents are undefined when the
processor comes out of reset (this is certainly the case with a power-on and most other resets that can occur
on SoPEC) the MMU will need to redirect accesses from 0x0000_0000 through 0x0000_00?? (the
minimum amount of redirection is currently TBD but is likely to be at least 16 bytes) to the bottom of the ROM
i.e. to 0x0028_0000 through 0x0028_00??.

A common software bug is zero-referencing or null pointer de-referencing (where the program attempts to
access the contents of address 0x0000_0000). To assist software debug the MMU will assert a bus error
every time the reset locations are accessed after the reset trap handler has legitimately been retrieved
immediately after reset. If desired this condition could result in a unique trap (e.g. a watchpoint
detected trap).
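The redirection and zero-reference trapping described above might be modelled as shown below. Several details are assumptions made purely for illustration: the redirected window is taken to be 16 bytes (the text gives this only as a likely minimum), the class and method names are hypothetical, and the way the MMU learns that the reset handler has been fetched is represented by an explicit method call because the mechanism is not specified.

```python
ROM_BASE = 0x0028_0000
REDIRECT_BYTES = 16  # assumption: the real amount is still TBD


class BusError(Exception):
    """Stands in for the MMU asserting a bus error."""


class ResetVectorGuard:
    def __init__(self):
        # False from reset until the reset handler has been fetched.
        self.reset_fetch_done = False

    def end_reset_fetch(self):
        """Hypothetical hook marking the end of the legitimate fetch."""
        self.reset_fetch_done = True

    def translate(self, addr):
        """Return the physical address for addr, or raise BusError."""
        if addr >= REDIRECT_BYTES:
            return addr                   # no translation elsewhere
        if self.reset_fetch_done:
            raise BusError("zero-reference access at 0x%08X" % addr)
        return ROM_BASE + addr            # redirect to the bottom of ROM
```

With this model the first fetches after reset read from 0x0028_0000 onwards, and any later access to the same low addresses is reported as a bus error.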

11.6.5 MMU Configuration Registers 

These are the only configuration registers in the CPU block. Note that all the MMU configuration registers
may only be accessed when the CPU is running in supervisor mode.



Table 18. MMU Configuration Registers

Address | Register | #bits | Reset | Description
0x00 | Region0Bottom | 17 | 0x0_0000 | This register contains the physical address that marks the bottom of region 0
0x04 | Region0Top | 17 | 0xF_FFFF | This register contains the physical address that marks the top of region 0. Region 0 covers the entire address space after reset whereas all other regions are zero-sized initially.
0x08 | Region1Bottom | 17 | 0x0_0000 | This register contains the physical address that marks the bottom of region 1
0x0C | Region1Top | 17 | 0x0_0000 | This register contains the physical address that marks the top of region 1
0x10 | Region2Bottom | 17 | 0x0_0000 | This register contains the physical address that marks the bottom of region 2
0x14 | Region2Top | 17 | 0x0_0000 | This register contains the physical address that marks the top of region 2
0x18 | Region3Bottom | 17 | 0x0_0000 | This register contains the physical address that marks the bottom of region 3
0x1C | Region3Top | 17 | 0x0_0000 | This register contains the physical address that marks the top of region 3
0x20 | Region4Bottom | 17 | 0x0_0000 | This register contains the physical address that marks the bottom of region 4
0x24 | Region4Top | 17 | 0x0_0000 | This register contains the physical address that marks the top of region 4
0x28 | Region5Bottom | 17 | 0x0_0000 | This register contains the physical address that marks the bottom of region 5
0x2C | Region5Top | 17 | 0x0_0000 | This register contains the physical address that marks the top of region 5
0x30 | Region6Bottom | 17 | 0x0_0000 | This register contains the physical address that marks the bottom of region 6
0x34 | Region6Top | 17 | 0x0_0000 | This register contains the physical address that marks the top of region 6
0x38 | Region7Bottom | 17 | 0x0_0000 | This register contains the physical address that marks the bottom of region 7
0x3C | Region7Top | 17 | 0x0_0000 | This register contains the physical address that marks the top of region 7
0x40 | Region0Control | 6 | 0x07 | Control register for region 0
0x44 | Region1Control | 6 | 0x07 | Control register for region 1
0x48 | Region2Control | 6 | 0x07 | Control register for region 2
0x4C | Region3Control | 6 | 0x07 | Control register for region 3
0x50 | Region4Control | 6 | 0x07 | Control register for region 4
0x54 | Region5Control | 6 | 0x07 | Control register for region 5
0x58 | Region6Control | 6 | 0x07 | Control register for region 6
0x5C | Region7Control | 6 | 0x07 | Control register for region 7
0x60 | BusTimeout | 16 | 0x00FF | This register should be set to the number of pclk cycles to wait before aborting an access with a bus error.
0x64 | DebugSelect | 7 | 0x00 | Contains address of the register selected for debug observation. It is expected that a number of pseudo-registers will be made available for debug observation and these will be outlined during the implementation phase.



11.6.5.1 RegionTop and RegionBottom registers

The 20 Mbit of embedded DRAM on SoPEC is arranged as 81920 words of 256 bits each. All region
boundaries need to align with a 256-bit word. Thus only 17 bits are required for the RegionNTop and
RegionNBottom registers. The byte address of these locations can be obtained by simply left-shifting the
register value by 5 bits i.e. cpu_adr[21:0] = RegionNTop/Bottom[16:0] << 5.
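The figures quoted above can be checked with a few lines of arithmetic. The constant and function names are illustrative only.

```python
DRAM_BITS = 20 * 2**20   # 20 Mbit of embedded DRAM
WORD_BITS = 256          # width of one DRAM word

# Number of 256-bit words and the register width needed to index them.
dram_words = DRAM_BITS // WORD_BITS
register_bits = (dram_words - 1).bit_length()


def byte_address(register_value):
    """cpu_adr[21:0] = RegionNTop/Bottom[16:0] << 5."""
    return (register_value & 0x1FFFF) << 5


print(dram_words, register_bits, hex(byte_address(dram_words - 1)))
```

The last DRAM word therefore starts at byte address 0x27FFE0, which is consistent with 2.5 MBytes of DRAM starting at address zero.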

Both the RegionNTop and RegionNBottom registers are inclusive i.e. the addresses in the registers are
included in the region. The size of the smallest active region is therefore one 256-bit word i.e. 32 bytes.

If DRAM regions overlap (there is no reason for this to be the case but there is nothing to prohibit it either)
then only accesses allowed by all overlapping regions are permitted. That is if a DRAM address appears in
both Region1 and Region3 (for example) the cpu_acode of an access is checked against the access
permissions of both regions. If both regions permit the access then it will proceed but if either or both regions do
not permit the access then it will not be allowed.

The MMU does not support negatively sized regions i.e. the value of the RegionNTop register should
never be less than the value of the RegionNBottom register. If RegionNTop is lower in the address map
than RegionNBottom then the region is considered to be zero-sized and is ignored.

When both the RegionNTop and RegionNBottom registers for a region contain the same value the region is
then simply one 256-bit word in length and this corresponds to the smallest possible active region.

11.6.5.2 Region Control registers

Each memory region has a control register associated with it. The RegionNControl register is used to set
the access conditions for the memory region bounded by the RegionNTop and RegionNBottom registers.
Table 19 describes the function of each bit field in the RegionNControl registers. All bits in a
RegionNControl register are both readable and writable by design. However, like all registers in the MMU, the
RegionNControl registers can only be accessed by code running in supervisor mode.



Table 19. Region Control Register

Field Name | Bits | Description
SupervisorAccess | 2:0 | Denotes the type of access allowed when the CPU is running in Supervisor mode. For each access type a 1 indicates the access is permitted and a 0 indicates the access is not permitted. bit0 - Data read access permission; bit1 - Data write access permission; bit2 - Instruction fetch access permission
UserAccess | 5:3 | Denotes the type of access allowed when the CPU is running in User mode. For each access type a 1 indicates the access is permitted and a 0 indicates the access is not permitted. bit3 - Data read access permission; bit4 - Data write access permission; bit5 - Instruction fetch access permission
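The region checks of sections 11.6.5.1 and 11.6.5.2 can be combined into one short sketch: inclusive bounds, zero-sized regions ignored, and an access permitted only if every overlapping region allows it. The function names are hypothetical, and the decision to refuse an access that matches no region at all is an assumption, since the text does not cover that case.

```python
READ, WRITE, FETCH = 0, 1, 2  # bit position within each 3-bit field


def region_allows(control, supervisor, access):
    """Test one RegionNControl value (Table 19 bit layout)."""
    bit = access + (0 if supervisor else 3)
    return bool((control >> bit) & 1)


def dram_access_permitted(regions, word_addr, supervisor, access):
    """regions is a list of (bottom, top, control) register values and
    word_addr is a 256-bit word address (byte address >> 5)."""
    matching = [control for bottom, top, control in regions
                if bottom <= top               # top < bottom: zero-sized
                and bottom <= word_addr <= top]
    if not matching:
        return False  # assumption: no matching region means no access
    return all(region_allows(c, supervisor, access) for c in matching)
```

With the reset values of Table 18 only region 0 is active and its control value of 0x07 permits every supervisor access while refusing all user accesses.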



11.6.5.3 Status Register

The SPARC V8 architecture allows for a number of types of memory access error to be trapped. These trap
types and trap handling in general are described in chapter 7 of the SPARC architecture manual [32].
According to the SPARC architecture manual the processor will automatically move to the next register
window (i.e. it decrements the current window pointer) and copy the program counters (PC and nPC) to
two local registers in the new window. The supervisor bit in the PSR is also set and the PSR can be saved
to another local register by the trap handler (this does not happen automatically in hardware).

At the time of writing it is not clear whether the LEON core can easily accept memory access error trap
types (i.e. the 8-bit tt field of the Trap Base register). Further investigation is needed to determine if this is
possible and if existing trap types will cover the different types of bus error possible on SoPEC. Up to 32
implementation specific trap types are allowed so conditions unique to SoPEC can be handled in this
manner.

If it is not possible for sufficient information about the cause of the bus error to be passed to the LEON 
core using the above mechanisms then a status register will be implemented to record the relevant informa- 
tion. 

11.6.6 MMU Sub-block partition 

As can be seen from Figure 19 and Figure 20 the MMU consists of five principal sub-blocks. For clarity
the connections between these sub-blocks and other SoPEC blocks and between each of the sub-blocks are
shown in two separate diagrams.

Figure 19. MMU Sub-block partition, external signal view

Figure 20. MMU Sub-block partition, internal signal view

11.6.6.1 LEON Bridge 

At the time of writing it is expected that the LEON core will be used with its AHB interface rather than be
modified to comply with the protocols used on SoPEC, in particular the DIU protocol for DRAM access.
The LEON bridge consists of an AHB bridge and some glue logic. The AHB bridge will convert between
the AHB and the DIU and CPU subsystem bus protocols. The AHB bridge will always be a slave on the
AHB. Glue logic will be required to assist with endianness coherency, interrupts and other miscellaneous
signalling.



Table 20. LEON bridge I/Os

Port name | Pins | I/O | Description

Global SoPEC signals
prst_n | 1 | In | Global reset. Synchronous to pclk, active low.
pclk | 1 | In | Global clock

LEON Bridge to AHB signals
haddr[31:0] | 32 | In | AHB address bus
hwdata[31:0] | 32 | In | AHB write data bus
hrdata[31:0] | 32 | Out | AHB read data bus
hsel | 1 | In | AHB slave select signal
hwrite | 1 | In | AHB write signal: 1 - Write access; 0 - Read access
htrans | 2 | In | Indicates the type of the current transfer: 00 - IDLE; 01 - BUSY; 10 - NONSEQ; 11 - SEQ
hsize | 3 | In | Indicates the size of the current transfer: 000 - Byte transfer; 001 - Halfword transfer; 010 - Word transfer; 011 - 64-bit transfer (unsupported?); 1xx - Unsupported larger wordsizes
hburst | 3 | In | Indicates if the current transfer forms part of a burst and the type of burst: 000 - SINGLE; 001 - INCR; 010 - WRAP4; 011 - INCR4; 100 - WRAP8; 101 - INCR8; 110 - WRAP16; 111 - INCR16
hprot | 4 | In | Protection control signals pertaining to the current access: hprot[0] - Opcode(0) / Data(1) access; hprot[1] - User(0) / Supervisor(1) access; hprot[2] - Non-bufferable(0) / Bufferable(1) access (unsupported); hprot[3] - Non-cacheable(0) / Cacheable(1) access
hmaster | 4 | In | Indicates the identity of the current bus master. This will always be the LEON core.
hmastlock | 1 | In | Indicates that the current master is performing a locked sequence of transfers.
hready | 1 | Out | Active high ready signal indicating the access has completed
hresp | 2 | Out | Indicates the status of the transfer: 00 - OKAY; 01 - ERROR; 10 - RETRY; 11 - SPLIT
hsplit | 16 | Out | This 16-bit split bus is used by a slave to indicate to the arbiter which bus masters should be allowed to attempt a split transaction. This feature will be unsupported on the AHB bridge

Toplevel/Common LEON bridge signals
cpu_dataout[31:0] | 32 | Out | Data out bus to both DRAM and peripheral devices.
cpu_rwn | 1 | Out | Read/NotWrite signal. 1 = Current access is a read access. 0 = Current access is a write access
icu_cpu_ilevel[3:0] | 4 | In | An interrupt is asserted by driving the appropriate priority level on icu_cpu_ilevel. These signals must remain asserted until the CPU executes an interrupt acknowledge cycle.
cpu_icu_ilevel[3:0] | 4 | In | Indicates the level of the interrupt the CPU is acknowledging when cpu_iack is high
cpu_iack | 1 | Out | Interrupt acknowledge signal. The exact timing depends on the CPU core implementation
cpu_start_access | 1 | Out | Start Access signal indicating the start of a data transfer and that the cpu_adr, cpu_dataout, cpu_rwn and cpu_acode signals are all valid. This signal is only asserted during the first cycle of an access.
cpu_ben[1:0] | 2 | Out | Byte enable signals.

LEON core to LEON bridge signals
iui_irl | 4 | Out | Interrupt level request to the LEON Integer Unit
iuo_irl | 4 | In | Acknowledged interrupt level from the LEON Integer Unit
iuo_intack | 1 | In | Interrupt acknowledge signal from the LEON Integer Unit

LEON bridge to MMU Control Block signals
cpu_mmu_adr | 32 | Out | CPU Address Bus.
mmu_cpu_data | 32 | In | Data bus from the MMU
mmu_cpu_rdy | 1 | In | Ready signal from the MMU
cpu_mmu_acode | 2 | Out | Access code signals to the MMU
mmu_cpu_berr | 1 | In | Bus error signal from the MMU



Description: 

The LEON bridge must ensure that all CPU bus and interrupt transactions are functionally correct and that
the timing requirements are met. This sub-block is also responsible for ensuring endianness coherency, i.e.
guaranteeing that the correct data appears in the correct position on the data buses (hrdata, cpu_dataout
and mmu_cpu_data) for every type of access. This is a requirement because the LEON uses big-endian
addressing while the rest of SoPEC is little-endian.
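
The endianness requirement above can be illustrated with a small sketch. It is not the bridge implementation: the function names swap32 and byte_lane are hypothetical, and the sketch only assumes the standard definitions of big-endian and little-endian byte ordering within a 32-bit word.

```python
def swap32(word: int) -> int:
    """Reverse the byte order of a 32-bit word (big-endian <-> little-endian)."""
    return int.from_bytes((word & 0xFFFFFFFF).to_bytes(4, "big"), "little")


def byte_lane(byte_offset: int, big_endian: bool) -> int:
    """Return the bit position of the LSB of the lane carrying a given byte.

    byte_offset is the byte address within the word (0..3). On a big-endian
    bus byte 0 travels on bits 31:24; on a little-endian bus it travels on
    bits 7:0.
    """
    if not 0 <= byte_offset <= 3:
        raise ValueError("byte_offset must be in the range 0..3")
    return 8 * (3 - byte_offset) if big_endian else 8 * byte_offset


if __name__ == "__main__":
    print(hex(swap32(0x11223344)))
    print(byte_lane(0, big_endian=True), byte_lane(0, big_endian=False))
```

Which of the two views (swapping whole words or remapping byte lanes) applies to a given access type is a design decision for the bridge and is not settled by this sketch.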


It is expected that some signals (especially those external to the CPU block) will need to be registered here 
to meet the timing requirements. Careful thought will be required to ensure that overall CPU access times 
are not excessively degraded by the use of too many register stages. 

11.6.6.2 DIU Bus Interface

The DIU bus interface will handle all valid accesses to the embedded DRAM via the DIU. The DIU bus
interface ensures that the access conforms to the DIU bus protocol while the DIU manages the arbitration
and data alignment.



Table 21. DIU Bus Interface I/Os

Port name               Pins  I/O  Description

Global SoPEC signals
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock.

Toplevel/Common DIU Bus Interface signals
dram_cpu_data[255:0]    256   In   Read data from the DRAM.
cpu_diu_rreq            1     Out  Read request to the DIU.
diu_cpu_rack            1     In   Acknowledge from DIU that read request has been
                                   accepted.
diu_cpu_rvalid          1     In   Signal from DIU indicating that valid read data
                                   is on the dram_cpu_data bus.
cpu_diu_wreq            1     Out  Write request to the DIU.
diu_cpu_wack            1     In   Acknowledge from the DIU that the write request
                                   has been accepted.
cpu_diu_wvalid          1     Out  Signal from the CPU to the DIU indicating that
                                   the data currently on the cpu_dataout bus is
                                   valid.
cpu_diu_wmask[1:0]      2     Out  Flag indicating format of CPU write to DRAM.
                                   These signals are directly derived from the
                                   cpu_ben signals.
                                   cpu_diu_wmask = 00: 8-bit write
                                   cpu_diu_wmask = 01: 16-bit write
                                   cpu_diu_wmask = 10: 32-bit write
                                   cpu_diu_wmask = 11: reserved
                                   cpu_adr[2:0] are driven in accordance with the
                                   width of the data access indicated by
                                   cpu_diu_wmask. Addresses cannot cross a 256-bit
                                   word DRAM boundary.
dram_rdy                1     Out  Data Ready signal. Indicates the data on the
                                   dram_cpu_data bus is valid for a read cycle or
                                   that the data was successfully dispatched to the
                                   DIU for a write cycle.

DIU Bus Interface to MMU Control Block signals
cpu_adr[21:0]           22    In   Toplevel CPU Address bus.
dram_data[31:0]         32    Out  Data bus containing the 32 bits addressed by
                                   cpu_adr[4:2] from the 256-bit DRAM read bus
                                   dram_cpu_data.
dram_access_en          1     In   Enable Access signal. A DRAM access cannot be
                                   initiated unless it has been enabled by the MMU
                                   Control Unit.

DIU Bus Interface to ICache signals
ic_cache_hit            1     In   Cache hit signal from the ICache. This indicates
                                   that the current CPU read request is being
                                   serviced by the ICache and so should not be
                                   retrieved from the DRAM.

DIU Bus Interface to LEON bridge signals
cpu_ben[1:0]            2     In   Byte enable signals from the LEON bridge. These
                                   are forwarded on to the DIU as the cpu_diu_wmask
                                   signals.
cpu_start_access        1     In   Start Access signal from the LEON bridge
                                   indicating the start of a data transfer and that
                                   the cpu_adr, cpu_dataout, cpu_rwn and cpu_acode
                                   signals are all valid. This signal is only
                                   asserted during the first cycle of an access.



Description: 

The DIU Bus Interface handles all data transfers between the CPU (or ICache) and the DIU. This involves
translating between the different protocols used on the DIU and CPU buses. The validity of an access (i.e.
whether the CPU is running in the correct mode for the address space being accessed) is determined by the
MMU Control Block, which also checks that the access does not cross a 256-bit boundary (as required by
the DIU); dram_access_en is asserted if it is a valid access. Invalid accesses do not initiate DRAM
accesses. The operation of the DIU Bus Interface is described by the state machine shown in Figure 21 and
the DIU bus protocol is described in more detail in section 20.9. The DIU will return a 256-bit dataword
on dram_cpu_data[255:0] for every read access. The DIU Bus Interface must select the appropriate 32-bit
word from this according to the word address given by cpu_adr[4:2].
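
The word selection described above can be sketched in a few lines. This is an illustrative model only: the function name select_word is hypothetical, and the sketch assumes that word 0 occupies bits 31:0 of dram_cpu_data (little-endian word ordering within the 256-bit line), which the text does not state explicitly.

```python
def select_word(dram_cpu_data: int, cpu_adr: int) -> int:
    """Pick the 32-bit word addressed by cpu_adr[4:2] from a 256-bit DRAM word.

    Assumption: word index 0 lives in bits 31:0 of dram_cpu_data.
    """
    word_index = (cpu_adr >> 2) & 0x7           # cpu_adr[4:2]
    return (dram_cpu_data >> (32 * word_index)) & 0xFFFFFFFF


if __name__ == "__main__":
    # Build a 256-bit line whose eight words are 0xA0 .. 0xA7.
    line = 0
    for i in range(8):
        line |= (0xA0 + i) << (32 * i)
    print(hex(select_word(line, 0x04)))
```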
[Figure 21: state diagram, not reproduced. Recoverable details: on reset (prst_n) the outputs
cpu_diu_rreq, cpu_diu_wreq, cpu_diu_wvalid and dram_rdy are 0. A read access starts when
cpu_start_access == 1 AND dram_access_en == 1 AND ic_cache_hit == 0 AND cpu_rwn == 1 and passes
through the Read Access Acknowledge and Read Access Complete states, with the data select muxes
configured according to cpu_adr[4:2]. A write access starts under the same conditions with
cpu_rwn == 0 and passes through the Write Access Initiated, Write Access Acknowledge and Write
Access Complete states.]

Figure 21. DIU Bus Interface state machine



11.6.6.3 CPU Subsystem Bus Interface

The CPU Subsystem Bus Interface block handles all valid accesses to the peripheral blocks that comprise the
CPU Subsystem.



Table 22. CPU Subsystem Bus Interface I/Os

Port name               Pins  I/O  Description

Global SoPEC signals
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock.

Toplevel/Common CPU Subsystem Bus Interface signals
cpu_cpr_sel             1     Out  CPR block select.
cpu_gpio_sel            1     Out  GPIO block select.
cpu_icu_sel             1     Out  ICU block select.
cpu_lss_sel             1     Out  LSS block select.
cpu_pcu_sel             1     Out  PCU block select.
cpu_scb_sel             1     Out  SCB block select.
cpu_tim_sel             1     Out  Timers block select.
cpu_rom_sel             1     Out  ROM block select.
cpu_pss_sel             1     Out  PSS block select.
cpu_diu_sel             1     Out  DIU block select.
cpr_cpu_data[31:0]      32    In   Read data bus from the CPR block.
gpio_cpu_data[31:0]     32    In   Read data bus from the GPIO block.
icu_cpu_data[31:0]      32    In   Read data bus from the ICU block.
lss_cpu_data[31:0]      32    In   Read data bus from the LSS block.
pcu_cpu_data[31:0]      32    In   Read data bus from the PCU block.
scb_cpu_data[31:0]      32    In   Read data bus from the SCB block.
tim_cpu_data[31:0]      32    In   Read data bus from the Timers block.
rom_cpu_data[31:0]      32    In   Read data bus from the ROM block.
pss_cpu_data[31:0]      32    In   Read data bus from the PSS block.
diu_cpu_data[31:0]      32    In   Read data bus from the DIU block.
cpr_cpu_rdy             1     In   Ready signal to the CPU. When cpr_cpu_rdy is high
                                   it indicates the last cycle of the access. For a
                                   write cycle this means cpu_dataout has been
                                   registered by the CPR block and for a read cycle
                                   this means the data on cpr_cpu_data is valid.
gpio_cpu_rdy            1     In   GPIO ready signal to the CPU.
icu_cpu_rdy             1     In   ICU ready signal to the CPU.
lss_cpu_rdy             1     In   LSS ready signal to the CPU.
pcu_cpu_rdy             1     In   PCU ready signal to the CPU.
scb_cpu_rdy             1     In   SCB ready signal to the CPU.
tim_cpu_rdy             1     In   Timers block ready signal to the CPU.
rom_cpu_rdy             1     In   ROM block ready signal to the CPU.
pss_cpu_rdy             1     In   PSS block ready signal to the CPU.
diu_cpu_rdy             1     In   DIU register block ready signal to the CPU.
cpr_cpu_berr            1     In   Bus Error signal from the CPR block.
gpio_cpu_berr           1     In   Bus Error signal from the GPIO block.
icu_cpu_berr            1     In   Bus Error signal from the ICU block.
lss_cpu_berr            1     In   Bus Error signal from the LSS block.
pcu_cpu_berr            1     In   Bus Error signal from the PCU block.
scb_cpu_berr            1     In   Bus Error signal from the SCB block.
tim_cpu_berr            1     In   Bus Error signal from the Timers block.
rom_cpu_berr            1     In   Bus Error signal from the ROM block.
pss_cpu_berr            1     In   Bus Error signal from the PSS block.
diu_cpu_berr            1     In   Bus Error signal from the DIU block.

CPU Subsystem Bus Interface to MMU Control Block signals
cpu_adr[19:12]          8     In   Toplevel CPU Address bus. Only bits 19-12 are
                                   required to decode the peripherals address space.
peri_access_en          1     In   Enable Access signal. A peripheral access cannot
                                   be initiated unless it has been enabled by the
                                   MMU Control Unit.
peri_mmu_data[31:0]     32    Out  Data bus from the selected peripheral.
peri_mmu_rdy            1     Out  Data Ready signal. Indicates the data on the
                                   peri_mmu_data bus is valid for a read cycle or
                                   that the data was successfully written to the
                                   peripheral for a write cycle.
peri_mmu_berr           1     Out  Bus Error signal. Indicates a bus error has
                                   occurred in accessing the selected peripheral.

CPU Subsystem Bus Interface to LEON bridge signals
cpu_start_access        1     In   Start Access signal from the LEON bridge
                                   indicating the start of a data transfer and that
                                   the cpu_adr, cpu_dataout, cpu_rwn and cpu_acode
                                   signals are all valid. This signal is only
                                   asserted during the first cycle of an access.



Description:

The CPU Subsystem Bus Interface block performs simple address decoding to select a peripheral and
multiplexes the signals returned from the peripheral blocks. Accesses to the MMU configuration registers
are handled by the MMU Control Block rather than the CPU Subsystem Bus Interface block. The CPU Subsystem
Bus Interface block operation is described by the following pseudocode:

    masked_cpu_adr = cpu_adr[19:12]
    case (masked_cpu_adr)
      when TIM_base[19:12]
        cpu_tim_sel = peri_access_en      // The peri_access_en signal will have the
        peri_mmu_data = tim_cpu_data      // timing required for block selects
        peri_mmu_rdy = tim_cpu_rdy
        peri_mmu_berr = tim_cpu_berr
        all_other_selects = 0             // shorthand to ensure other cpu_block_sel signals
                                          // remain deasserted
      when LSS_base[19:12]
        cpu_lss_sel = peri_access_en
        peri_mmu_data = lss_cpu_data
        peri_mmu_rdy = lss_cpu_rdy
        peri_mmu_berr = lss_cpu_berr
        all_other_selects = 0
      when GPIO_base[19:12]
        cpu_gpio_sel = peri_access_en
        peri_mmu_data = gpio_cpu_data
        peri_mmu_rdy = gpio_cpu_rdy
        peri_mmu_berr = gpio_cpu_berr
        all_other_selects = 0
      when SCB_base[19:12]
        cpu_scb_sel = peri_access_en
        peri_mmu_data = scb_cpu_data
        peri_mmu_rdy = scb_cpu_rdy
        peri_mmu_berr = scb_cpu_berr
        all_other_selects = 0
      when ICU_base[19:12]
        cpu_icu_sel = peri_access_en
        peri_mmu_data = icu_cpu_data
        peri_mmu_rdy = icu_cpu_rdy
        peri_mmu_berr = icu_cpu_berr
        all_other_selects = 0
      when CPR_base[19:12]
        cpu_cpr_sel = peri_access_en
        peri_mmu_data = cpr_cpu_data
        peri_mmu_rdy = cpr_cpu_rdy
        peri_mmu_berr = cpr_cpu_berr
        all_other_selects = 0
      when ROM_base[19:12]
        cpu_rom_sel = peri_access_en
        peri_mmu_data = rom_cpu_data
        peri_mmu_rdy = rom_cpu_rdy
        peri_mmu_berr = rom_cpu_berr
        all_other_selects = 0
      when PSS_base[19:12]
        cpu_pss_sel = peri_access_en
        peri_mmu_data = pss_cpu_data
        peri_mmu_rdy = pss_cpu_rdy
        peri_mmu_berr = pss_cpu_berr
        all_other_selects = 0
      when DIU_base[19:12]
        cpu_diu_sel = peri_access_en
        peri_mmu_data = diu_cpu_data
        peri_mmu_rdy = diu_cpu_rdy
        peri_mmu_berr = diu_cpu_berr
        all_other_selects = 0
      when PCU_base[19:12]
        cpu_pcu_sel = peri_access_en
        peri_mmu_data = pcu_cpu_data
        peri_mmu_rdy = pcu_cpu_rdy
        peri_mmu_berr = pcu_cpu_berr
        all_other_selects = 0
      when others
        all_block_selects = 0
        peri_mmu_data = 0x00000000
        peri_mmu_rdy = 0
        peri_mmu_berr = 1
    end case
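
The decode-and-multiplex behaviour of the pseudocode above can be modelled as follows. This is a sketch under stated assumptions, not the block's implementation: the page numbers in BLOCK_PAGES are hypothetical placeholders (the real base addresses are defined elsewhere in the design), and the names PeriResponse and decode are introduced here for illustration.

```python
from typing import Dict, NamedTuple, Tuple


class PeriResponse(NamedTuple):
    data: int   # <block>_cpu_data
    rdy: int    # <block>_cpu_rdy
    berr: int   # <block>_cpu_berr


# Hypothetical values of <BLOCK>_base[19:12]; NOT taken from the specification.
BLOCK_PAGES: Dict[str, int] = {
    "tim": 0x80, "lss": 0x81, "gpio": 0x82, "scb": 0x83, "icu": 0x84,
    "cpr": 0x85, "rom": 0x86, "pss": 0x87, "diu": 0x88, "pcu": 0x89,
}


def decode(cpu_adr: int, peri_access_en: int,
           responses: Dict[str, PeriResponse]) -> Tuple[Dict[str, int], PeriResponse]:
    """Return (block selects, multiplexed peri_mmu_* response) for one access."""
    page = (cpu_adr >> 12) & 0xFF                     # cpu_adr[19:12]
    selects = {name: 0 for name in BLOCK_PAGES}       # all selects deasserted
    for name, block_page in BLOCK_PAGES.items():
        if page == block_page:
            selects[name] = peri_access_en            # only the addressed block
            return selects, responses[name]
    # "when others": no block decoded, so flag a bus error
    return selects, PeriResponse(data=0x00000000, rdy=0, berr=1)


if __name__ == "__main__":
    idle = {name: PeriResponse(0, 0, 0) for name in BLOCK_PAGES}
    idle["gpio"] = PeriResponse(data=0x1234, rdy=1, berr=0)
    print(decode(0x282004, 1, idle))
```

Keeping the select gated by peri_access_en mirrors the pseudocode: an address that decodes to a block still produces no select unless the MMU Control Block has enabled the access.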



11.6.6.4 MMU Control Block

The MMU Control Block determines whether every CPU access is a valid access. No more than one cycle
is to be consumed in determining the validity of an access and all accesses must terminate with the
assertion of either mmu_cpu_rdy or mmu_cpu_berr. To safeguard against stalling the CPU a simple bus timeout
mechanism will be supported.



Table 23. MMU Control Block I/Os

Port name               Pins  I/O  Description

Global SoPEC signals
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock.

Toplevel/Common MMU Control Block signals
cpu_adr[21:0]           22    Out  Address bus for both DRAM and peripheral access.
cpu_acode[1:0]          2     Out  CPU access code signals (cpu_mmu_acode) retimed
                                   to meet the CPU Subsystem Bus timing
                                   requirements.
dram_access_en          1     Out  DRAM Access Enable signal. Indicates that the
                                   current CPU access is a valid DRAM access.

MMU Control Block to LEON bridge signals
cpu_mmu_adr[31:0]       32    In   CPU core address bus.
cpu_dataout[31:0]       32    In   Toplevel CPU data bus.
mmu_cpu_data[31:0]      32    Out  Data bus to the CPU core. Carries the data for
                                   all CPU read operations.
cpu_rwn                 1     In   Toplevel CPU Read/notWrite signal.
cpu_mmu_acode[1:0]      2     In   CPU access code signals.
mmu_cpu_rdy             1     Out  Ready signal to the CPU core. Indicates the
                                   completion of all valid CPU accesses.
mmu_cpu_berr            1     Out  Bus Error signal to the CPU core. This signal is
                                   asserted to terminate an invalid access.
cpu_start_access        1     In   Start Access signal from the LEON bridge
                                   indicating the start of a data transfer and that
                                   the cpu_adr, cpu_dataout, cpu_rwn and cpu_acode
                                   signals are all valid. This signal is only
                                   asserted during the first cycle of an access.
cpu_iack                1     In   Interrupt Acknowledge signal from the CPU. This
                                   signal is only asserted during an interrupt
                                   acknowledge cycle.
cpu_ben[1:0]            2     In   Byte enable signals indicating which bytes of
                                   the 32-bit bus are being accessed.

MMU Control Block to DIU Bus Interface signals
dram_rdy                1     In   Data Ready signal. Indicates the data on the
                                   dram_cpu_data bus is valid for a read cycle or
                                   that the data was successfully dispatched to the
                                   DIU for a write cycle.

MMU Control Block to ICache signals
ic_data[31:0]           32    In   Data bus from the ICache.
ic_rdy                  1     In   Ready signal from the ICache indicating the data
                                   on ic_data is valid.

MMU Control Block to CPU Subsystem Bus Interface signals
peri_access_en          1     Out  Enable Access signal. A peripheral access cannot
                                   be initiated unless it has been enabled by the
                                   MMU Control Unit.
peri_mmu_data[31:0]     32    In   Data bus from the selected peripheral.
peri_mmu_rdy            1     In   Data Ready signal. Indicates the data on the
                                   peri_mmu_data bus is valid for a read cycle or
                                   that the data was successfully written to the
                                   peripheral for a write cycle.
peri_mmu_berr           1     In   Bus Error signal. Indicates a bus error has
                                   occurred in accessing the selected peripheral.



Description: 

The MMU Control Block is responsible for the MMU's core functionality, namely determining whether or
not an access to any part of the address map is valid. An access is considered valid if it is to a mapped area
of the address space and if the CPU is running in the appropriate mode for that address space. Furthermore
the MMU Control Block must correctly handle the following special cases: an interrupt acknowledge cycle, a
reset exception vector fetch, an access that crosses a 256-bit DRAM word boundary and a bus timeout
condition. The following pseudocode shows the logic required to implement the MMU Control Block
functionality. It does not deal with the timing relationships of the various signals; it is the designer's
responsibility to ensure that these relationships are correct and comply with the different bus protocols.
For simplicity the pseudocode is split up into numbered sections so that the functionality may be seen
more easily.
PS0 Description: This first segment of code defines a number of constants and variables that are used
elsewhere in this description. Most signals have been defined in the I/O descriptions of the MMU
sub-blocks that precede this section of the document. The post_reset_state variable is used later (in section
PS4) to determine if we should translate the reset exception vector address or trap a null pointer access.

PS0:

    const UnusedBottom = 0x002AC000
    const DRAMTop = 0x0027FFFF
    const UserDataSpace = b01
    const UserProgramSpace = b00
    const SupervisorDataSpace = b11
    const SupervisorProgramSpace = b10

    const timeout_limit = 0x40 // Need to confirm that this is a suitable value
    const ResetExceptionCycles = 0x8

    cpu_adr_peri_masked[7:0] = cpu_mmu_adr[19:12]
    cpu_adr_dram_masked[16:0] = cpu_mmu_adr & 0x003FFFE0

    if (prst_n == 0) then // Initialise everything
      cpu_adr = cpu_mmu_adr[21:0]
      peri_access_en = 0
      dram_access_en = 0
      mmu_cpu_data = peri_mmu_data
      mmu_cpu_rdy = 0
      mmu_cpu_berr = 0
      post_reset_state = TRUE
      access_initiated = FALSE
      cpu_access_cnt = 0

    // The following is used to determine if we are coming out of reset for the purposes of
    // reset exception vector redirection. There may be a convenient signal in the CPU core
    // that we could use instead of this.

    if ((cpu_start_access == 1) AND (cpu_access_cnt < ResetExceptionCycles) AND
        (clock_tick == TRUE)) then
      cpu_access_cnt = cpu_access_cnt + 1
    else
      post_reset_state = FALSE

PS1 Description: This section is at the top of the hierarchy that determines the validity of an access. The
address is tested to see which macro-region (i.e. Unused, CPU Subsystem or DRAM) it falls into or
whether the reset exception vector is being accessed.

PS1:

    if (cpu_mmu_adr >= UnusedBottom) then
      // The access is to an invalid area of the address space. See section PS2

    elsif ((cpu_mmu_adr > DRAMTop) AND (cpu_mmu_adr < UnusedBottom)) then
      // We are in the CPU Subsystem/PEP Subsystem address space. See section PS3

    // Only remaining possibility is an access to DRAM address space
    // First we need to intercept the special case for the reset exception vector

    elsif (cpu_mmu_adr < 0x00000010) then
      // The reset exception is being accessed. See section PS4

    elsif ((cpu_adr_dram_masked >= Region0Bottom) AND (cpu_adr_dram_masked <=
           Region0Top)) then
      // We are in Region0. See section PS5

    elsif ((cpu_adr_dram_masked >= RegionNBottom) AND (cpu_adr_dram_masked <=
           RegionNTop)) then // we are in RegionN
      // Repeat the Region0 (i.e. section PS5) logic for each of Region1 to Region7

    else // We could end up here if there were gaps in the DRAM regions
      peri_access_en = 0
      dram_access_en = 0
      mmu_cpu_berr = 1 // we have an unknown access error, most likely due to hitting
      mmu_cpu_rdy = 0  // a gap in the DRAM regions

    // Only thing remaining is to implement a bus timeout function. This is done in PS6
    end
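
The order of the PS1 tests can be made concrete with a small classifier. This is a sketch only: the function name classify and the region list passed to it are hypothetical, the two constants are copied from PS0, and the region bounds are assumed to be compared against the byte address masked with 0x003FFFE0.

```python
from typing import List, Tuple

UNUSED_BOTTOM = 0x002AC000        # UnusedBottom from PS0
DRAM_TOP = 0x0027FFFF             # DRAMTop from PS0
RESET_VECTOR_LIMIT = 0x00000010   # addresses below this are the reset exception vector
DRAM_MASK = 0x003FFFE0            # mask applied to form cpu_adr_dram_masked


def classify(cpu_mmu_adr: int, regions: List[Tuple[int, int]]) -> str:
    """Return the macro-region an address falls into, testing in PS1 order.

    regions is a list of (bottom, top) pairs; these bounds are illustrative,
    not the programmed RegionNBottom/RegionNTop register values.
    """
    if cpu_mmu_adr >= UNUSED_BOTTOM:
        return "unused"                      # handled by PS2
    if cpu_mmu_adr > DRAM_TOP:
        return "peripheral"                  # handled by PS3
    if cpu_mmu_adr < RESET_VECTOR_LIMIT:
        return "reset_vector"                # handled by PS4
    masked = cpu_mmu_adr & DRAM_MASK
    for n, (bottom, top) in enumerate(regions):
        if bottom <= masked <= top:
            return f"region{n}"              # handled by PS5 logic
    return "gap"                             # bus error: gap between DRAM regions


if __name__ == "__main__":
    example_regions = [(0x000000, 0x0FFFE0), (0x200000, 0x27FFE0)]
    for adr in (0x8, 0x100, 0x150000, 0x27FFFF, 0x280000, 0x2AC000):
        print(hex(adr), classify(adr, example_regions))
```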

PS2 Description: Accesses to the large unused area of the address space are trapped by this section. No
bus transactions are initiated and the mmu_cpu_berr signal is asserted.

PS2:

    elsif (cpu_mmu_adr >= UnusedBottom) then
      peri_access_en = 0 // The access is to an invalid area of the address space
      dram_access_en = 0
      mmu_cpu_berr = 1
      mmu_cpu_rdy = 0

PS3 Description: This section deals with accesses to CPU Subsystem peripherals, including the MMU
itself. If the MMU registers are being accessed then no external bus transactions are required. Access to
the MMU registers is only permitted if the CPU is making a data access from supervisor mode, otherwise
a bus error is asserted and the access terminated. For non-MMU accesses, transactions occur over the
CPU Subsystem Bus and each peripheral is responsible for determining whether or not the CPU is in the
correct mode (based on the cpu_acode signals) to be permitted access to its registers. Note that all of the
PEP registers are accessed via the PCU, which is on the CPU Subsystem Bus.

PS3:

    elsif ((cpu_mmu_adr > DRAMTop) AND (cpu_mmu_adr < UnusedBottom)) then
      // We are in the CPU Subsystem/PEP Subsystem address space

      cpu_adr = cpu_mmu_adr[21:0]

      if (cpu_adr_peri_masked == MMU_base) then // access is to local registers
        peri_access_en = 0
        dram_access_en = 0

        if (cpu_acode == SupervisorDataSpace) then
          for (i=0; i<26; i++) {
            if (i == cpu_mmu_adr[6:2]) then // selects the addressed register
              if (cpu_rwn == 1) then
                mmu_cpu_data[16:0] = MMUReg[i] // MMUReg[i] is one of the
                mmu_cpu_rdy = 1                // registers in Table 18
                mmu_cpu_berr = 0
              else // write cycle
                MMUReg[i] = cpu_dataout[16:0]
                mmu_cpu_rdy = 1
                mmu_cpu_berr = 0
            else // there is no register mapped to this address
              mmu_cpu_berr = 1 // do we really want a bus_error here as registers
              mmu_cpu_rdy = 0  // are just mirrored in other blocks
          }
        else // we have an access violation
          mmu_cpu_berr = 1
          mmu_cpu_rdy = 0

      else // access is to something else on the CPU Subsystem Bus
        peri_access_en = 1
        dram_access_en = 0
        mmu_cpu_data = peri_mmu_data
        mmu_cpu_rdy = peri_mmu_rdy
        mmu_cpu_berr = peri_mmu_berr

PS4 Description: The only correct accesses to the locations beneath 0x00000010 are fetches of the reset
trap handling routine and these should be the first accesses after reset. Here we trap all other accesses to
these locations regardless of the CPU mode. The most likely cause of such an access will be the use of a
null pointer in the program executing on the CPU.

PS4:

    elsif (cpu_mmu_adr < 0x00000010) then // may need to translate a wider range - depends
      if (post_reset_state == TRUE) then  // on how LEON handles the reset exception.
        cpu_adr[21:0] = {ROM_base[21:3], cpu_mmu_adr[2:0]}
        peri_access_en = 1
        dram_access_en = 0
        mmu_cpu_data = peri_mmu_data
        mmu_cpu_rdy = peri_mmu_rdy
        mmu_cpu_berr = peri_mmu_berr
      else // we have a problem (almost certainly a null pointer)
        peri_access_en = 0
        dram_access_en = 0
        mmu_cpu_berr = 1
        mmu_cpu_rdy = 0

PS5 Description: This section of pseudocode simply checks whether the access is within the bounds
of DRAM Region0 and if so whether or not the access is of a type permitted by the Region0Control
register. If the access is permitted then a DRAM access is initiated for all data accesses and for instruction
fetches that result in a cache miss. All instruction fetches are returned via the ICache interface regardless
of whether they come from a cache hit or a refill from DRAM. If the access is not of a type permitted by the
Region0Control register then the access is terminated with a bus error.

PS5:

    elsif ((cpu_adr_dram_masked >= Region0Bottom) AND (cpu_adr_dram_masked <=
           Region0Top)) then // we are in Region0

      // Check that the access does not cross a 256-bit boundary;
      // only 16 or 32-bit CPU accesses are capable of traversing a 256-bit boundary

      if (((cpu_mmu_adr[4:0] == 0x1F) AND ((cpu_ben == b01) OR (cpu_ben == b10)))
          OR ((cpu_mmu_adr[4:0] == 0x1E) AND (cpu_ben == b10))
          OR ((cpu_mmu_adr[4:0] == 0x1D) AND (cpu_ben == b10))) then
        peri_access_en = 0
        dram_access_en = 0
        mmu_cpu_berr = 1
        mmu_cpu_rdy = 0

      else // access does not cross 256-bit boundary so we can proceed
        cpu_adr = cpu_mmu_adr[21:0]
        if (cpu_rwn == 1) then
          if ((cpu_acode == SupervisorProgramSpace AND Region0Control[2] == 1)
              OR (cpu_acode == UserProgramSpace AND Region0Control[5] == 1)) then
            // this is a valid instruction fetch from Region0
            peri_access_en = 0
            dram_access_en = 1
            mmu_cpu_data = ic_data
            mmu_cpu_rdy = ic_rdy
            mmu_cpu_berr = 0

          elsif ((cpu_acode == SupervisorDataSpace AND Region0Control[0] == 1)
                 OR (cpu_acode == UserDataSpace AND Region0Control[3] == 1)) then
            // this is a valid read access from Region0
            peri_access_en = 0
            dram_access_en = 1
            mmu_cpu_data = dram_data // possibly drc_data if dcache is used
            mmu_cpu_rdy = dram_rdy   // possibly drc_rdy
            mmu_cpu_berr = 0

          else // we have an access violation
            peri_access_en = 0
            dram_access_en = 0
            mmu_cpu_berr = 1
            mmu_cpu_rdy = 0

        else // it is a write access
          if ((cpu_acode == SupervisorDataSpace AND Region0Control[1] == 1)
              OR (cpu_acode == UserDataSpace AND Region0Control[4] == 1)) then
            // this is a valid write access to Region0
            peri_access_en = 0
            dram_access_en = 1
            mmu_cpu_rdy = dram_rdy // possibly dwc_rdy if dcache is used
            mmu_cpu_berr = 0

          else // we have an access violation
            peri_access_en = 0
            dram_access_en = 0
            mmu_cpu_berr = 1
            mmu_cpu_rdy = 0
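
The two checks made in PS5 can be expressed compactly as below. This is an illustrative sketch rather than the hardware logic: the function names are hypothetical, the cpu_ben encoding (00 = 8-bit, 01 = 16-bit, 10 = 32-bit) is assumed to match the cpu_diu_wmask encoding in Table 21, and the control register bit positions are read off the PS5 pseudocode.

```python
USER_PROGRAM = 0b00
USER_DATA = 0b01
SUPERVISOR_PROGRAM = 0b10
SUPERVISOR_DATA = 0b11

# Assumed cpu_ben encoding -> access width in bytes.
ACCESS_BYTES = {0b00: 1, 0b01: 2, 0b10: 4}


def crosses_256bit_boundary(cpu_mmu_adr: int, cpu_ben: int) -> bool:
    """True if the access would run past the end of its 32-byte DRAM word."""
    if cpu_ben not in ACCESS_BYTES:
        raise ValueError("reserved cpu_ben encoding")
    return (cpu_mmu_adr & 0x1F) + ACCESS_BYTES[cpu_ben] > 32


def region_access_permitted(cpu_acode: int, cpu_rwn: int, region_control: int) -> bool:
    """Check an access type against a RegionNControl value.

    Bit use as read from PS5: [0] supervisor data read, [1] supervisor data
    write, [2] supervisor instruction fetch, [3] user data read, [4] user data
    write, [5] user instruction fetch.
    """
    if cpu_rwn == 1:   # read or instruction fetch
        bit = {SUPERVISOR_DATA: 0, SUPERVISOR_PROGRAM: 2,
               USER_DATA: 3, USER_PROGRAM: 5}[cpu_acode]
    else:              # writes are only defined for the data spaces
        bit = {SUPERVISOR_DATA: 1, USER_DATA: 4}.get(cpu_acode)
        if bit is None:
            return False
    return bool((region_control >> bit) & 1)


if __name__ == "__main__":
    print(crosses_256bit_boundary(0x1E, 0b10))
    print(region_access_permitted(USER_DATA, 0, 0b010000))
```

The single comparison in crosses_256bit_boundary gives the same result as the three explicit address tests in the pseudocode for every byte offset and supported width.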

PS6 Description: This final section of pseudocode deals with the special case of a bus timeout. This
occurs when an access has been initiated but has not completed before the timeout_limit number of pclk
cycles. While access to both DRAM and CPU/PEP Subsystem registers will take a variable number of
cycles (due to DRAM traffic, PCU command execution or the different timing required to access registers
in imported IP) each access should complete before the timeout_limit occurs. Therefore it should not be
possible to stall the CPU by locking either the CPU Subsystem or DIU buses. However given the fatal
effect such a stall would have it is considered prudent to implement bus timeout detection.

PS6:

    // Only thing remaining is to implement a bus timeout function.

    if (cpu_start_access == 1) then
      access_initiated = TRUE
      timeout_countdown = timeout_limit

    if ((mmu_cpu_rdy == 1) OR (mmu_cpu_berr == 1)) then
      access_initiated = FALSE
      peri_access_en = 0
      dram_access_en = 0

    if ((clock_tick == TRUE) AND (access_initiated == TRUE)) then
      if (timeout_countdown > 0) then
        timeout_countdown--
      else // timeout has occurred
        peri_access_en = 0 // abort the access
        dram_access_en = 0
        mmu_cpu_berr = 1
        mmu_cpu_rdy = 0
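
The countdown behaviour of PS6 can be simulated cycle by cycle as follows. This is a behavioural sketch only: the class name BusTimeout and its tick method are hypothetical, one call to tick stands for one pclk cycle, and the model clears access_initiated itself when the timeout fires (in the pseudocode this happens through the assertion of mmu_cpu_berr).

```python
class BusTimeout:
    """Cycle-level model of the PS6 bus timeout countdown."""

    def __init__(self, timeout_limit: int = 0x40) -> None:
        self.timeout_limit = timeout_limit
        self.access_initiated = False
        self.countdown = 0

    def tick(self, cpu_start_access: int, rdy: int, berr: int) -> bool:
        """Advance one clock; return True if a timeout bus error fires now."""
        if cpu_start_access:
            self.access_initiated = True
            self.countdown = self.timeout_limit
        if rdy or berr:
            self.access_initiated = False       # access terminated normally
        if self.access_initiated:
            if self.countdown > 0:
                self.countdown -= 1
            else:
                self.access_initiated = False   # abort the access
                return True
        return False


if __name__ == "__main__":
    timer = BusTimeout(timeout_limit=3)
    print([timer.tick(1, 0, 0)] + [timer.tick(0, 0, 0) for _ in range(4)])
```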
11.6.6.5 ICache

The ICache sub-block implementation is described in section 11.7.1.1.



11.7 Cache



The decision on what type of caching solution to use on SoPEC is still open for the moment. There are
two probable solutions: a) use the LEON caches with a minimal configuration (1 KB I and D caches) and
b) use separate, simple one line 256-bit caches for instruction, data read and data write accesses. From a
performance and (most likely) implementation point of view the LEON caches are the best solution,
however they are much bigger than the one line caches (approx 6x). The one line caches do not offer the same
degree of performance improvement as the LEON caches and are likely to add an extra cycle to all
memory accesses. The performance penalty for a LEON cache miss (i.e. for all memory accesses if we are not
using the LEON caches) and the best and worst case access times from DRAM have yet to be fully
determined. The final decision on which caching solution to use will be made when all such information is
available.

Therefore the section on caches, which was present in previous versions of this document but is now
mostly out of date, has been removed (the ICache is still relevant if one line caches are used and so is
retained).



11.7.1 Instruction Cache 



A caching mechanism would offer the advantage of greater aggregate performance while still guaranteeing
a minimum level of performance. While greater performance may not be required at present for this
application the caching mechanism offers greater efficiency (i.e. MIPS/MHz) and so the CPU clock could be
reduced without affecting, or only negligibly affecting, the operating performance. The advantage here is
that the design is scalable: better performance can be achieved by simply increasing the clock rate.

As all reads from the embedded DRAM on SoPEC produce words that are 256 bits wide it is inefficient to
hook this up to a 32-bit CPU bus as 224 bits of each read would be discarded. If the full 256-bit word is
stored locally to the CPU as a single-line cache then a ??x performance improvement could be obtained in
the typical case (this is of course highly code dependent). This single line cache would be very easy to
implement as it would just involve the address to be compared to a single tag and no replacement
algorithm would be required. Furthermore the area impact would be minor and there should be no performance
penalty for cache misses. As the dram_cpu_data bus is 256 bits wide the requested word is immediately
available to the CPU, i.e. we do not need to perform critical word first reordering of the data.

The instruction cache is only accessed for instruction fetches, not all CPU reads. These can be
differentiated by signals emanating from the CPU. Non-instruction CPU reads would be supported by the data
cache. In the case of a cache miss the read request is processed by the MMU to ensure the request is valid
before a read request is generated on the relevant external (to the CPU block) bus. The MMU should be
informed of a cache hit to ensure it does not generate an unnecessary read request. This requires that the
regions used to store code are aligned on 32-byte (256-bit) boundaries.

As there is no requirement to have more time deterministic code execution the instruction cache cannot be
disabled.



11.7.1.1 iCache implementation 



The Instruction Cache used in SoPEC is capable of storing just a single 256-bit DRAM word. An
implementation is depicted in Figure 22 below. The block I/Os are given in Table 24 and these should be viewed
in conjunction with Figure 19 and Figure 20 for a complete depiction of the connectivity of the block.
Figure 22. ICache Block Diagram



Table 24. ICache I/Os

Port name               Pins  I/O  Description

Global SoPEC signals
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock.

Toplevel ICache signals
dram_cpu_data[255:0]    256   In   Data bus from the DIU.
cpu_acode[1:0]          2     In   CPU access control signals.
cpu_adr[21:2]           20    In   CPU core address bus.

ICache to DIU Bus Interface signals
ic_cache_hit            1     Out  Cache hit signal. This indicates that the
                                   current CPU read request is being serviced by
                                   the ICache and so should not be retrieved from
                                   the DRAM.
dram_rdy                1     In   Data Ready signal. Indicates the data on the
                                   dram_cpu_data bus is valid.

ICache to MMU Control Block signals
ic_data[31:0]           32    Out  ICache data bus.
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Description: 

The Tag stores the DRAM word address of the word currently in cache. The Tag contents are compared 
with cpu_adr[21:5] each time the CPU requests an instruction fetch from a valid DRAM address 
(indicated by cpu_acode[0] and dram_access_en). If a match occurs (i.e. a cache hit) the access is serviced by 
returning the correct 32 bits (as selected by cpu_adr[4:2]) to the MMU Control Block. If a match does not 
occur (i.e. a cache miss) the ic_cache_hit line is held low, indicating to the DIU Bus Interface that a 
DRAM access should commence. Completion of the DRAM access is signalled by the assertion of 
dram_rdy and this causes the ICache contents to be updated, the Tag value replaced and the relevant 32 
bits forwarded to the CPU accompanied by the assertion of the ic_rdy signal. The Tag is updated each time 
the cache line is refilled from DRAM. All instruction fetches from DRAM are cacheable, regardless of which 
DRAM region is being accessed (although the access permissions still need to match those programmed 
for the region) and whether the CPU is in user or supervisor mode. 
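
The hit/miss behaviour described above can be illustrated with a small behavioural model. This is only a sketch, not the RTL: the class name, the read_dram_line callback and the valid flag are assumptions introduced for illustration (the text does not say how the cache behaves before its first fill), while the address slicing follows the cpu_adr[21:5] tag and cpu_adr[4:2] word select given in the description.

```python
WORDS_PER_LINE = 8  # one 256-bit DRAM word holds eight 32-bit CPU words


class SingleLineICache:
    """Behavioural model of a one-line instruction cache (illustrative only)."""

    def __init__(self):
        self.valid = False  # assumption: the line is treated as empty after reset
        self.tag = 0        # DRAM word address of the cached line, cpu_adr[21:5]
        self.line = [0] * WORDS_PER_LINE

    def fetch(self, cpu_adr, read_dram_line):
        """Return (instruction, hit) for an instruction fetch at cpu_adr.

        read_dram_line(tag) is a hypothetical callback that returns the eight
        32-bit words of the 256-bit DRAM word addressed by tag.
        """
        tag = (cpu_adr >> 5) & 0x1FFFF  # cpu_adr[21:5], 17 bits
        word = (cpu_adr >> 2) & 0x7     # cpu_adr[4:2] picks one of eight words
        hit = self.valid and tag == self.tag
        if not hit:
            # Miss: refill the whole line from DRAM and replace the Tag.
            self.line = list(read_dram_line(tag))
            self.tag = tag
            self.valid = True
        return self.line[word], hit
```

Because there is only one line, no replacement policy is needed: every miss simply overwrites the line and its Tag.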



11.7.2 Data Cache 

11.8 Realtime Debug Unit (RDU) 

The RDU facilitates the observation of the contents of most of the CPU addressable registers in the SoPEC 
device in addition to some pseudo-registers in realtime. The contents of pseudo-registers, i.e. registers that 
are collections of otherwise unobservable signals and that do not affect the functionality of a circuit, are 
defined in each block as required. Many blocks do not have pseudo-registers and some blocks (e.g. ROM, 
PSS) do not make debug information available to the RDU as it would be of little value in realtime debug. 

Each block that supports realtime debug observation features a DebugSelect register that controls a local 
mux to determine which register is output on the block's data bus (i.e. block_cpu_data). One small 
drawback with reusing the block's data bus is that the debug data cannot be present on the same bus during a 
CPU read from the block. An accompanying active high block_cpu_debug_valid signal is used to indicate 
when the data bus contains valid debug data and when the bus is being used by the CPU. There is no 
arbitration for the bus as the CPU will always have access when required. A block diagram of the RDU is 
shown in Figure 23. 

Figure 23. Realtime Debug Unit block diagram 



Table 25. RDU I/Os 

Port name              Pins  I/O  Description 
---------------------  ----  ---  ------------------------------------------------ 
diu_cpu_data           32    In   Read data bus from the DIU block 
cpr_cpu_data           32    In   Read data bus from the CPR block 
gpio_cpu_data          32    In   Read data bus from the GPIO block 
icu_cpu_data           32    In   Read data bus from the ICU block 
lss_cpu_data           32    In   Read data bus from the LSS block 
pcu_cpu_debug_data     32    In   Read data bus from the PCU block 
scb_cpu_data           32    In   Read data bus from the SCB block 
tim_cpu_data           32    In   Read data bus from the TIM block 
diu_cpu_debug_valid    1     In   Signal indicating the data on the diu_cpu_data 
                                  bus is valid debug data. 
tim_cpu_debug_valid    1     In   Signal indicating the data on the tim_cpu_data 
                                  bus is valid debug data. 
scb_cpu_debug_valid    1     In   Signal indicating the data on the scb_cpu_data 
                                  bus is valid debug data. 
pcu_cpu_debug_valid    1     In   Signal indicating the data on the pcu_cpu_data 
                                  bus is valid debug data. 
lss_cpu_debug_valid    1     In   Signal indicating the data on the lss_cpu_data 
                                  bus is valid debug data. 
icu_cpu_debug_valid    1     In   Signal indicating the data on the icu_cpu_data 
                                  bus is valid debug data. 
gpio_cpu_debug_valid   1     In   Signal indicating the data on the gpio_cpu_data 
                                  bus is valid debug data. 
cpr_cpu_debug_valid    1     In   Signal indicating the data on the cpr_cpu_data 
                                  bus is valid debug data. 
debug_data_out         18    Out  Output debug data to be muxed on to the 
                                  PHI/GPIO/other pins 
debug_data_valid       1     Out  Debug valid signal indicating the validity of 
                                  the data on debug_data_out. This signal is used 
                                  in all debug configurations 
debug_cntrl            19    Out  Control signal for each PHI bound debug data 
                                  line indicating whether or not the debug data 
                                  should be selected by the pin mux 


As there are no spare pins that can be used to output the debug data to an external capture device some of 
the existing I/Os will have a debug multiplexer placed in front of them to allow them to be used as debug 
pins. Unfortunately many of the pins on SoPEC cannot even be multiplexed in this fashion so it will not be 
possible to output a full 32-bit debug data word every cycle. The exact number of pins available for 
multiplexing had yet to be finalised at the time of writing. This specification assumes 20 pins will be available 
but this can easily be revised up or, more likely, down. Furthermore not every pin that has a debug mux 
will always be available to carry the debug data as they may be engaged in their primary purpose e.g. as a 
GPIO pin. The RDU therefore outputs a debug_cntrl signal with each debug data bit to indicate whether 
the mux associated with each debug pin should select the debug data or the normal data for the pin. The 
DebugPinSel register is used to determine which of the 20? potential debug pins are enabled for debug at any 
particular time. 

As it is not possible to output a full 32-bit debug word every cycle the RDU supports the outputting of an 
n-bit sub-word every cycle to the enabled debug pins. Each debug test would then need to be re-run a 
number of times with a different portion of the debug word being output on the n-bit sub-word each time. The 
data from each run should then be correlated to create a full 32-bit (or whatever size is needed) debug 
word for every cycle. The debug_data_valid and pclk_out signals will accompany every sub-word to allow 
the data to be sampled correctly. The pclk_out signal is sourced close to its output pad rather than in the 
RDU to minimise the skew between the rising edge of the debug data signals (which should be registered 
close to their output pads) and the rising edge of pclk_out. 

As multiple debug runs will be needed to obtain a complete set of debug data the n-bit sub-word will need 
to contain a different bit pattern for each run. For maximum flexibility each debug pin has an associated 
DebugDataSrc register that allows any of the 32 bits of the debug data word to be output on that particular 
debug data pin. The debug data pin must be enabled for debug operation by having its corresponding bit in 
the DebugPinSel register set for the selected debug data bit to appear on the pin. 

The size of the sub-word is determined by the number of enabled debug pins, which is controlled by the 
DebugPinSel register. Note that the debug_data_valid signal is always output. Furthermore 
debug_cntrl[0] (which is configured by DebugPinSel[0]) controls the mux for both the debug_data_valid 
and pclk_out signals as both of these must be enabled for any debug operation. 
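
The multi-run capture scheme can be sketched in software as follows. This is an illustrative model under stated assumptions, not a description of the RDU logic: the function names are hypothetical, and it assumes that data pin n is enabled by DebugPinSel bit n (with bit 0 reserved for debug_data_valid and pclk_out), which the text does not state explicitly.

```python
def drive_debug_pins(debug_word, pin_sel, data_src):
    """Return {pin: bit} for every enabled debug data pin in one cycle.

    debug_word -- 32-bit debug data word from the selected block
    pin_sel    -- DebugPinSel value; bit 0 gates debug_data_valid and pclk_out
    data_src   -- {pin: DebugDataSrc value (0..31)} for the data pins
    """
    if not pin_sel & 1:
        return {}  # no debug operation is possible without bit 0 set
    return {pin: (debug_word >> src) & 1
            for pin, src in data_src.items()
            if (pin_sel >> pin) & 1}


def reassemble(runs):
    """Merge several runs of (data_src, pin_values) into one debug word."""
    word = 0
    for data_src, pins in runs:
        for pin, bit in pins.items():
            word |= bit << data_src[pin]  # put each bit back where it came from
    return word
```

With four data pins enabled, for example, eight runs with DebugDataSrc settings covering bits 0-3, 4-7 and so on would be needed to recover a full 32-bit word for a given cycle.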

The mapping of debug_data_out[n] signals onto individual pins will take place outside the RDU. When 
the exact mapping has been finalised it will be recorded here. A proposed mapping is shown in Table 26 
below. 



Table 26. Example DebugPinSel mapping 

Bit    Pin 
-----  ------------------------------------------------------------ 
0      phi_frclk. The debug_data_valid signal will appear on this 
       pin when enabled. Enabling this pin also automatically 
       enables the phi_readl pin which will output the pclk_out 
       signal 
1      phi_profile 
2      phi_lsyncl 
3      test pin 1 
4      test pin 2 
5-18   gpio[0...13] 



Table 27. RDU Configuration Registers 

Address       Register       #bits  Reset     Description 
------------  -------------  -----  --------  ------------------------------------ 
0x80          DebugSrc       4      0x00      Denotes which block is supplying the 
                                              debug data. The encoding of this 
                                              block is given below. 
                                              0 - MMU 
                                              1 - TIM 
                                              2 - LSS 
                                              3 - GPIO 
                                              4 - SCB 
                                              5 - ICU 
                                              6 - CPR 
                                              7 - DIU 
                                              8 - PCU 
0x84          DebugPinSel    19     0x0_0000  Determines whether a pin is used for 
                                              debug data output. A provisional 
                                              mapping of pin to bit position is 
                                              given in Table 26. 
                                              1 - Pin outputs debug data 
                                              0 - Normal pin function 
0x88 to 0xCC  DebugDataSrcN  5      0x00      Selects which bit of the 32-bit debug 
                                              data word will be output on 
                                              debug_data_out[N] 



11.9 Interrupt Operation 



The interrupt controller unit (see chapter 14) generates an interrupt request by driving interrupt request 
lines with the appropriate interrupt level. LEON supports 15 levels of interrupt with level 15 as the highest 
level (the SPARC architecture manual [32] states that level 15 is non-maskable but we have the freedom to 
mask this if desired). The CPU will begin processing an interrupt exception when execution of the current 
instruction has completed and it will only do so if the interrupt level is higher than the current processor 
priority. If a second interrupt request arrives with the same level as an executing interrupt service routine 
then the exception will not be processed until the executing routine has completed. 

When an interrupt trap occurs the LEON hardware will place the program counters (PC and nPC) into two 
local registers. The interrupt handler routine is expected, as a minimum, to place the PSR register in 
another local register to ensure that the LEON can correctly return to its pre-interrupt state. The 4-bit 
interrupt level (irl) is also written to the trap type (tt) field of the TBR (Trap Base Register) by hardware. The 
TBR then contains the vector of the trap handler routine to which the processor will then jump. The TBA (Trap 
Base Address) field of the TBR must have a valid value before any interrupt processing can occur so it 
should be configured at an early stage. 
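
As a worked illustration of how the trap vector is formed, the sketch below computes the handler address for a given interrupt level. The function names are hypothetical. The field layout (TBA in bits 31:12, trap type in bits 11:4, low four bits zero) and the trap type encoding of 0x10 plus the interrupt level are taken from the SPARC V8 architecture rather than from this document, and the priority test models only the rule stated above (it ignores the ET bit and any special treatment of level 15).

```python
def interrupt_taken(irl, processor_priority):
    """True if an interrupt at level irl would be taken.

    Models only the rule that the level must be higher than the current
    processor priority; an equal level has to wait.
    """
    return irl > processor_priority


def trap_vector(tba, irl):
    """Return the TBR value (handler address) for interrupt level irl.

    tba -- trap base address; only bits 31:12 are used
    irl -- interrupt level, 1..15
    """
    if not 1 <= irl <= 15:
        raise ValueError("interrupt level must be in the range 1..15")
    tt = 0x10 + irl                      # SPARC V8 trap type for interrupt level irl
    return (tba & 0xFFFFF000) | (tt << 4)
```

For example, with a trap base address of 0x40000000 a level 8 interrupt vectors to 0x40000180, since each trap table entry is 16 bytes long.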

Interrupt pre-emption is supported while the ET (Enable Traps) bit of the PSR is set. This bit is cleared during 
the initial trap processing. In initial simulations the ET bit was observed to be cleared for up to 30 cycles. 
This causes significant additional interrupt latency in the worst case where a higher priority interrupt 
arrives just as a lower priority one is taken. 

The interrupt acknowledge cycles shown in Figure 24 below are derived from simulations of the LEON 
processor and accompanying interrupt controller. This interrupt controller will be replaced by the ICU in 
the SoPEC design. The LEON signal names are used for future reference. An interrupt is asserted by 
driving its (encoded) level on the iui.irl[3:0] signals. The LEON core responds to this, with variable timing, by 
reflecting the level of the taken interrupt on the iuo.irl[3:0] signals and asserting the acknowledge signal 
iuo.intack. The interrupt controller then removes the interrupt level one cycle after it has seen the level being 
acknowledged by the core. If there is another pending interrupt (of lower priority) then this should be 
driven on iui.irl[3:0] and the CPU will take that interrupt (the level 9 interrupt in the example below) once 
it has finished processing the higher priority interrupt. The iuo.irl[3:0] signals always reflect the level of 
the last taken interrupt, even when the CPU has finished processing all interrupts. 



[Figure 24 timing diagram: waveforms for pclk, iui.irl[3:0], iuo.irl[3:0] and iuo.intack, shown for a 
single interrupt and for a pending interrupt.] 

Figure 24. Interrupt acknowledge cycles for a single and pending interrupts 

11.10 Boot Operation 

See section 17.2 for a description of the SoPEC boot operation. 

11.11 Software Debug 

Software debug mechanisms are discussed in the "SoPEC Software Debug" document [15]. 

12 Serial Communications Block (SCB) 



12.1 Overview 

The Serial Communications Block (SCB) handles the movement of all data between the SoPEC and the 
host device (i.e. PC) and between master and slave SoPEC devices. The SCB consists of a USB1.1 device 
controller, an Inter-SoPEC Interface (ISI) and a DMA manager. A block diagram of the SCB is shown in 
Figure 25 below. The major blocks of the SCB, namely the ISI, USB and DMA manager, could be 
implemented as separate blocks but are integrated to take advantage of the performance gains and design 
simplifications that a tighter coupling allows. 




[Figure 25 block diagram: the SCB and its interface signals. 
Clocks and resets: prst_n, pclk, usb_clk, usb_cpr_reset_n, isi_cpr_reset_n. 
CPU interface: cpu_adr[n:2], cpu_dataout[31:0], scb_cpu_data[31:0], cpu_scb_sel, cpu_rwn, 
cpu_acode[2:0], scb_cpu_rdy, scb_cpu_berr, scb_cpu_debug_valid. 
Interrupts: dma_icu_irq, isi_icu_irq, usb_icu_irq. 
DIU interface: scb_diu_wadr[21:5], scb_diu_data[63:0], scb_diu_wreq, diu_scb_wack, scb_diu_wvalid.] 

Figure 25. Serial Communications Block 

The USB Controller will be an imported piece of IP. There are many possible sources of this block but it is 
likely that it will be supplied by the silicon vendor - all three current silicon vendor candidates will supply 
USB1.1 controllers, although some of these have been sourced from a third party. 
The SCB can be seen in the context of the overall SoPEC device in Figure 26 below. 

[Figure 26 block diagram: the CPU sub-system (CPU, RDU, MMU, TIM, Boot ROM, ICU, PSS, CPR, GPIO, 
LSS and the SCB with its USB PHY, USB Device, ISI and DMA Ctrl blocks) on the CPU Subsystem Bus; the 
DRAM sub-system (eDRAM, DIU) on the DRAM bus; and the Print Engine Pipeline sub-system (PCU, 
CDU, CFU, LBD, SFU, TE, TFU, HCU, DNC, DWU, LLU, PHI) on the PEP Configuration Bus. External 
connections shown include the USB host and the motor control, LSS, ISI and LED pins.] 

Figure 26. SoPEC toplevel block diagram 

12.2 Definitions of I/Os 



Table 28. Serial Communications Block I/O 

Port name              Pins  I/O  Description 
---------------------  ----  ---  ------------------------------------------------ 
Clocks and Resets 
prst_n                 1     In   System reset signal. Active low. 
pclk                   1     In   System clock. 
usb_clk                1     In   Clock for the USB controller block. 
isi_cpr_reset_n        1     Out  Signal from the ISI indicating that ISI activity 
                                  has been detected while in sleep mode and so the 
                                  chip should be reset. Active low. 
usb_cpr_reset_n        1     Out  Signal from the USB controller that a USB reset 
                                  has occurred. Active low. 

CPU Interface 
cpu_adr[n:2]           n-1   In   CPU address bus. Exact width is currently TBD as 
                                  it is dependent on the address maps of imported 
                                  IP 
cpu_dataout[31:0]      32    In   Shared write data bus from the CPU 
scb_cpu_data[31:0]     32    Out  Read data bus to the CPU 
cpu_rwn                1     In   Common read/not-write signal from the CPU 
cpu_fc[2:0]            3     In   CPU Function Code signals. 
cpu_scb_sel            1     In   Block select from the CPU. When cpu_scb_sel is 
                                  high both cpu_adr and cpu_dataout are valid 
scb_cpu_rdy            1     Out  Ready signal to the CPU. When scb_cpu_rdy is 
                                  high it indicates the last cycle of the access. 
                                  For a write cycle this means cpu_dataout has 
                                  been registered by the SCB and for a read cycle 
                                  this means the data on scb_cpu_data is valid. 
scb_cpu_berr           1     Out  Bus error signal to the CPU indicating an 
                                  invalid access. 
scb_cpu_debug_valid    1     Out  Signal indicating that the data currently on 
                                  scb_cpu_data is valid debug data 

Interrupt signals 
dma_icu_irq            1     Out  DMA interrupt signal to the interrupt controller 
                                  block. 
isi_icu_irq            1     Out  ISI interrupt signal to the interrupt controller 
                                  block. 
usb_icu_irq            1     Out  USB interrupt signal to the interrupt controller 
                                  block. 

DIU interface 
scb_diu_wadr[21:5]     17    Out  Write address bus to the DIU 
scb_diu_data[63:0]     64    Out  Data bus to the DIU. 
scb_diu_wreq           1     Out  Write request to the DIU 
diu_scb_wack           1     In   Acknowledge from the DIU that the write request 
                                  was accepted. 
scb_diu_wvalid         1     Out  Signal from the SCB to the DIU indicating that 
                                  the data currently on the scb_diu_data[63:0] bus 
                                  is valid 

GPIO Interface 
isi_gpio_dout[1:0]     2     Out  ISI output data to GPIO pins 
isi_gpio_e[1:0]        2     Out  ISI output enable to GPIO pins 
gpio_isi_din[1:0]      2     In   Input data from GPIO pins to ISI 

12.3 Multi-SoPEC Systems 



While single SoPEC systems are expected to form the majority of SoPEC systems the SoPEC device must 
also support its use in multi-SoPEC systems such as that shown in Figure 27 below. A SoPEC may be 
assigned any one of a number of identities in a multi-SoPEC system. A SoPEC may be one or more of a 
PrintMaster, a LineSyncMaster, an ISIMaster, a StorageSoPEC or an ISISlave SoPEC. 



[Figure 27 diagram: multi-SoPEC system fed by USB from the host.] 

Figure 27. A3 duplex system featuring four printing SoPECs with a single SoPEC DRAM device 



12.3.1 ISIMaster device 

The ISIMaster is the only device allowed to drive the common ISI line (see Figure 28) and interfaces 
directly with the host. In most systems the ISIMaster will simply be the SoPEC connected to the USB bus. 
Future systems, however, may employ an ISI-Bridge chip to interface between the host and the ISI bus and 
in such systems the ISI-Bridge chip will be the ISIMaster. There can only be one ISIMaster on an ISI bus. 



12.3.2 PrintMaster device 

The PrintMaster device is responsible for co-ordinating all aspects of the print operation. This includes 
starting the print operation in all printing SoPECs and communicating status back to the host. When the 
ISIMaster is a SoPEC device it is also likely to be the PrintMaster. There may only be one 
PrintMaster in a system and it is most likely to be a SoPEC device. 

12.3.3 LineSyncMaster device 

The LineSyncMaster device generates the lsync pulse that all SoPECs in the system must synchronize 
their line outputs with. Any SoPEC in the system could act as a LineSyncMaster although the PrintMaster 
is probably the most likely candidate. It is possible that the LineSyncMaster may not be a SoPEC device at 
all - it could, for example, come from some OEM motor control circuitry. There may only be one 
LineSyncMaster in a system. 

12.3.4 Storage device 

For certain printer types it may be realistic to use one SoPEC as a storage device without using its print 
engine capability - that is to effectively use it as an ISI-attached DRAM. A storage SoPEC would receive 
data from the ISIMaster (most likely to be an ISI-Bridge chip) and then distribute it to the other SoPECs as 
required. No other type of data flow (e.g. ISISlave -> storage SoPEC -> ISISlave) would need to be 
supported in such a scenario. The SCB supports this functionality at no additional cost because the CPU 
handles the task of transferring outbound data from the embedded DRAM to the ISI transmit buffer. The CPU 
in a storage SoPEC will have almost nothing else to do. 

12.3.5 ISISlave device 

Multi-SoPEC systems will contain one or more ISISlave SoPECs. An ISISlave SoPEC is primarily used to 
generate dot data for the printhead IC it is driving. 
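
The identity rules in sections 12.3.1 to 12.3.5 can be captured in a small consistency check. This is a hypothetical helper for illustration only: the Role flag names and the check_roles function are not part of SoPEC, and the device dictionary is assumed to describe the SoPECs on a single ISI bus. Zero masters among the SoPECs is accepted because the ISIMaster may be an ISI-Bridge chip and the LineSyncMaster may be non-SoPEC circuitry.

```python
from enum import Flag, auto


class Role(Flag):
    """Identities a SoPEC may hold; a device may combine several."""
    PRINT_MASTER = auto()
    LINE_SYNC_MASTER = auto()
    ISI_MASTER = auto()
    STORAGE = auto()
    ISI_SLAVE = auto()


def check_roles(devices):
    """Return a list of problems for a {name: Role} description of one ISI bus."""
    problems = []
    for role in (Role.ISI_MASTER, Role.PRINT_MASTER, Role.LINE_SYNC_MASTER):
        count = sum(1 for roles in devices.values() if role in roles)
        if count > 1:
            problems.append(f"{count} devices claim {role.name}; only one is allowed")
    return problems
```

A typical two-chip system would give one SoPEC the ISIMaster, PrintMaster and LineSyncMaster identities together and make the other an ISISlave, which passes the check with no problems reported.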

12.3.6 ISI-Bridge device 

SoPEC is targeted at the low-cost small office / home office (SoHo) market. It may also be used in future 
systems that target different market segments which are likely to have a high speed interface capability. A 
future device, known as an ISI-Bridge chip, is envisaged which will feature both a high speed interface 
(such as USB2.0, Ethernet or IEEE1394) and one or more ISI interfaces. The use of multiple ISI buses 
would allow the construction of independent print systems within the one printer. The ISI-Bridge would be 
the ISIMaster for each of the ISI buses it interfaces to. 

12.3.7 Host device 

The host device will typically be, but is not required to be, a PC. Any device that can act as a USB host or 
that can interface to an ISI-Bridge chip could be the host device. In particular, with the development of 
USB On-The-Go (USB OTG), it is possible that a number of USB OTG enabled products such as PDAs or 
digital cameras will be able to directly interface with a SoPEC printer. 

12.4 Types of Communication 

12.4.1 Communications with host 

The host communicates directly with the ISIMaster in order to print pages. When the ISIMaster is a 
SoPEC, the communications channel is USB1.1. 

12.4.1.1 Host to ISIMaster communication 

The host will need to communicate the following information to the ISIMaster device: 

• Communications channel configuration and maintenance information 

• All data destined for PrintMaster, ISISlave or storage SoPEC devices. This data is simply relayed by 
the ISIMaster 

• Mapping of virtual communications channels, such as USB endpoints, to ISI destination 

12.4.1.2 ISIMaster to host communication 

The ISIMaster will need to communicate the following information to the host: 

• Communications channel configuration and maintenance information 

• All data originating from the PrintMaster, ISISlave or storage SoPEC devices and destined for the host. 
This data is simply relayed by the ISIMaster 

12.4.1.3 Host to PrintMaster communication 

The host will need to communicate the following information to the PrintMaster device: 

• Program code for the PrintMaster 

• Compressed page data for the PrintMaster 

• Control messages to the PrintMaster 

• Tables and static data required for printing e.g. dead nozzle tables, dither matrices etc. 

• Authenticatable messages to upgrade the printer's capabilities 

12.4.1.4 PrintMaster to host communication 

The PrintMaster will need to communicate the following information to the host: 

• Printer status information (i.e. authentication results, paper empty/jammed etc.) 

• Dead nozzle information 

• Memory buffer status information 

• Power management status 

• Encrypted SoPEC_id for use in the generation of PRINTER_QA keys during factory programming 

12.4.1.5 Host to ISISlave communication 

All communication between the host and ISISlave SoPEC devices must take place via the ISIMaster. In 
the case of a SoPEC ISIMaster it is possible to configure each individual USB endpoint to act as a control 
channel to an ISISlave SoPEC if desired, although the endpoints will be more usually used to transport 
data. The host will need to communicate the following information to ISISlave devices over the 
comms/ISI: 

• Program code for ISISlave SoPEC devices 

• Compressed page data for ISISlave SoPEC devices 

• Control messages to the ISISlave SoPEC (where a control channel is supported) 

• Tables and static data required for printing e.g. dead nozzle tables, dither matrices etc. 

• Authenticatable messages to upgrade the printer's capabilities 

12.4.1.6 ISISlave to host communication 

All communication between the ISISlave SoPEC devices and the host must take place via the ISIMaster. 
The ISISlave will need to communicate the following information to the host over the comms/ISI: 

• Responses to the host's control messages (where a control channel is supported) 

• Dead nozzle information from the ISISlave SoPEC. 

• Encrypted SoPEC_id for use in the generation of PRINTER_QA keys during factory programming 

12.4.2 Communication over ISI 

12.4.2.1 ISIMaster to PrintMaster communication 

The ISIMaster and PrintMaster will often be the same physical device. When they are different devices 
then the following information needs to be exchanged over the ISI: 

• All data from the host destined for the PrintMaster (see section 12.4.1.3). This data is simply relayed 
by the ISIMaster 

12.4.2.2 PrintMaster to ISIMaster communication 

The ISIMaster and PrintMaster will often be the same physical device. When they are different devices 
then the following information needs to be exchanged over the ISI: 

• All data from the PrintMaster destined for the host (see section 12.4.1.4). This data is simply relayed 
by the ISIMaster 



12.4.2.3 ISIMaster to ISISlave communication 

The ISIMaster may wish to communicate the following information to the ISISlaves: 

• All data (including program code such as ISIId enumeration) originating from the host and destined for 
the ISISlave (see section 12.4.1.5). This data is simply relayed by the ISIMaster 

• Wake up from sleep mode 



12.4.2.4 ISISlave to ISIMaster communication 

The ISISlave may wish to communicate the following information to the ISIMaster: 

• All data originating from the ISISlave and destined for the host (see section 12.4.1.6). This data is 
simply relayed by the ISIMaster 



12.4.2.5 PrintMaster to ISISlave communication 

When the PrintMaster is not the ISIMaster all ISI communication is done in response to ISI ping packets 
(see 12.6.4.5). When the PrintMaster is the ISIMaster then it will of course communicate directly with 
the ISISlaves. The PrintMaster SoPEC may wish to communicate the following information to the 
ISISlaves: 

• Ink status e.g. requests for dotCount data i.e. the number of dots in each color fired by the printheads 
connected to the ISISlaves 

• Configuration of GPIO ports e.g. for clutch control and lid open detect 

• Power down command telling the ISISlave to enter sleep mode 

• Ink cartridge fail information 

This list is not complete and the time constraints associated with these requirements have yet to be 
determined. 

In general the PrintMaster may need to be able to: 

• send messages to an ISISlave which will cause the ISISlave to return the contents of ISISlave registers 
to the PrintMaster, or 

• program ISISlave registers with values sent by the PrintMaster 

This should be under the control of software running on the CPU which writes messages to the ISI/SCB 
interface. 

12.4.2.6 ISISlave to PrintMaster communication 

ISISlaves may need to communicate the following information to the PrintMaster: 

• ink status e.g. dotCount data i.e. the number of dots in each color fired by the printheads connected to 
the ISISlaves 

• band related information e.g. finished band interrupts 

• page related information i.e. buffer underrun, page finished interrupts 

• MMU security violation interrupts 

• GPIO interrupts and status e.g. clutch control and lid open detect 

• printhead temperature 

• printhead dead nozzle information from SoPEC printhead nozzle tests 

• power management status 

This list is not complete and the time constraints associated with these requirements have yet to be deter- 
mined. 

As the ISI is an insecure interface, commands issued over the ISI should be of limited capability e.g. only 
limited register writes allowed. The software protocol needs to be constructed with this in mind. In general 
ISISlaves may need to return register or status messages to the PrintMaster or ISIMaster. They may also 
need to indicate to the PrintMaster or ISIMaster that a particular interrupt has occurred on the ISISlave. 
This should be under the control of software running on the CPU which writes messages to the ISI block. 

12.4.2.7 ISISlave to ISISlave communication 

It is currently not anticipated that there will be any direct communication between ISISlave SoPECs. 
However they can communicate indirectly via the ISIMaster SoPEC. The most likely scenario for such a 
communication mechanism is when the PrintMaster is not the ISIMaster (see sections 12.4.2.5 and 12.4.2.6 for a 
description of the information exchanged between a PrintMaster and an ISISlave). ISISlave to ISISlave 
communication would also be required when sending data stored in a storage SoPEC device to an 
ISISlave. 



12.5 USB 

The USB1.1 interface for the printer should consist of the USB connector, the necessary discretes for USB 
signalling and the SoPEC device. A SoPEC printer will act as a self-powered, full-speed device and 
SoPEC itself will not draw any power from the USB cable. It will support control and bulk transfers. 
Interrupt transfers are not considered necessary because the required interrupt-type functionality can be 
achieved by sending query messages over the control channel on a scheduled basis. There is no 
requirement to support either isochronous or low-speed transfers. The USB controller must support at least 5 
USB endpoints: a control endpoint (endpoint 0) and 4 bulk-data type endpoints. These 4 bulk-data type 
endpoints can be used for the transfer of any type of data: compressed page data, program data or control 
messages. They may also be mapped on to any target destination in a multi-SoPEC system i.e. 
configuration is completely programmable. They are envisaged as always being used as USB OUT endpoints i.e. 
they will transport data from the host to SoPEC (USB names endpoint direction from the host's point of 
view). Any feedback data (e.g. status information) will be returned to 
the host on the control channel (endpoint 0). 

The USB device enumeration process will be handled by the SoPEC CPU and USB controller. Note that 
this requires the on-chip ROM to contain all the required USB driver code. This is not expected to be the 
full USB driver but rather a "USB-lite" driver that has sufficient functionality to download a program to 
DRAM. 

Details of the configuration registers and interface signals will be provided when the implementation IP 
for the USB controller core has been selected. There are several potential candidates for the USB1.1 
controller that are being evaluated in terms of cost, maturity, licensing requirements/restrictions, quality of 
deliverables etc. - as already mentioned the choice of silicon vendor is likely to play a large part in 
selecting the USB controller. 

12.5.1 ISIMaster/ISISlave Identification 

The USB controller is used for data transfer if a SoPEC is an ISIMaster, although it may, in certain cases, 
also be used to transfer data to an ISISlave. If the USB is not used for data transfer the device will certainly be 
an ISISlave. In this case the USB pins could be used to identify the device as an ISISlave as the USB 
device controller is expected to allow the single-ended quiescent state of the USB pins to be read by the 
CPU either directly or indirectly (as there should be a register indicating whether the USB controller is 
operating as a full-speed or low-speed device). We adopt the convention that an ISIMaster SoPEC has its 
USB pins configured for full-speed operation (i.e. a pull-up resistor on D+) and an ISISlave SoPEC has its 
USB pins configured for low-speed operation (i.e. a pull-up resistor on D-). This allows the ROM 
bootcode to quickly determine whether the SoPEC is an ISIMaster or ISISlave without needing to wait for 
USB activity. While the ISISlave SoPEC's USB controller believes it is a low-speed device it is never used 
and may be disabled completely (if possible) once the device has been identified as an ISISlave. Note that 
other combinations on the D+ and D- lines may result in unreliable operation of the USB controller. 
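
The pin convention above amounts to a two-input decision that the boot code could make. The sketch below is illustrative only: the function name and its boolean arguments are hypothetical, standing in for whatever register the chosen USB controller eventually provides for reading the quiescent state of the pins.

```python
def identify_from_usb_pins(d_plus_high, d_minus_high):
    """Classify a SoPEC from the quiescent single-ended state of its USB pins.

    d_plus_high / d_minus_high -- True if the line is pulled high at rest.
    """
    if d_plus_high and not d_minus_high:
        return "ISIMaster"  # full-speed configuration: pull-up on D+
    if d_minus_high and not d_plus_high:
        return "ISISlave"   # low-speed configuration: pull-up on D-
    # Both high or both low is outside the convention, so make no decision.
    return "unknown"
```

Returning "unknown" for the other two combinations leaves room for the software to fall back on the activity-based identification when the pins are not strapped according to the convention.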

The SoPEC's identity as an ISIMaster or ISISlave may also be determined from USB or ISI activity. If
activity is seen on USB endpoints 2-4 then the device is an ISIMaster (note that it is not necessarily an
ISIMaster if activity is only seen on endpoints 0 or 1) and the ISI may automatically configure itself as an
ISIMaster in this situation. If the ISI receives ping packets then it is an ISISlave as only the ISIMaster can
send ping packets.

The most suitable ISIMaster/ISISlave identification scheme (i.e. use of USB pins or looking for USB/ISI
activity) can be chosen by the software for any given printer.
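The two identification schemes can be summarised in a few lines of code. This is only an illustrative sketch: the function names, the `Role` type and the way pin states and activity are passed in are hypothetical, since the USB controller registers have not yet been defined.

```python
from enum import Enum
from typing import Iterable, Optional


class Role(Enum):
    ISI_MASTER = "ISIMaster"
    ISI_SLAVE = "ISISlave"


def role_from_usb_pins(d_plus_pulled_up: bool, d_minus_pulled_up: bool) -> Role:
    """Pin convention: pull-up on D+ (full speed) = ISIMaster,
    pull-up on D- (low speed) = ISISlave."""
    if d_plus_pulled_up and not d_minus_pulled_up:
        return Role.ISI_MASTER
    if d_minus_pulled_up and not d_plus_pulled_up:
        return Role.ISI_SLAVE
    # Other combinations are outside the convention described in the text.
    raise ValueError("unsupported D+/D- combination")


def role_from_activity(active_usb_endpoints: Iterable[int],
                       isi_ping_received: bool) -> Optional[Role]:
    """Activity scheme: pings imply ISISlave, USB endpoints 2-4 imply ISIMaster.
    Activity on endpoints 0 or 1 alone is inconclusive (returns None)."""
    if isi_ping_received:
        return Role.ISI_SLAVE
    if any(ep in (2, 3, 4) for ep in active_usb_endpoints):
        return Role.ISI_MASTER
    return None
```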

12.5.2 Wake-up from sleep mode

The SoPEC will be placed in sleep mode after a suspend command is received by the USB controller. The 
extent of power-down in sleep mode is currently TBD (different silicon vendors offer different options) 
but it is expected to involve the loss of DRAM contents at a minimum. The USB controller (or portions of 
it) will continue to be powered and clocked in sleep mode. It is likely that a USB reset, as opposed to a 
device resume, will be required to bring SoPEC out of its sleep state as the sleep state is hoped to be logi- 
cally equivalent to the power down state. The exact reawakening mechanism will be finalised when the 
sleep state is more precisely defined and the particular implementation of the USB controller is chosen. 

The USB reset signal originating from the USB controller will be propagated to the CPR (as 
usb_cpr_reset_n) if the USBWakeupEnable bit of the WakeupEnable register (see Table 38) has been set. 
The USBWakeupEnable bit should therefore be set just prior to entering sleep mode. 

There are no conditions that require the SoPEC to initiate a USB device wake-up (i.e. where SoPEC
signals resume to the host after being suspended by the host).

12.5.3 USB Speed 

The USB speed will be determined by the amount of activity from other devices that share the USB bus with
the printer and the responsiveness of the host in handling USB interrupts. To guarantee bandwidth to the
printer it is recommended that no other devices are active on the USB bus between the printer and the host. 
If the printer is connected to a USB2.0 host or hub it may limit the bandwidth available to other devices 
connected to the same hub but it would not significantly affect the bandwidth available to other devices 
upstream of the hub. Used in the recommended configuration it is expected that an effective bandwidth of 
8-9 Mbit/s will be achieved. 
12.6 ISI (Inter SoPEC Interface)

The ISI is utilised in all system configurations requiring more than one SoPEC. An example of such a
system, which requires four SoPECs for duplex A3 printing and an additional SoPEC used as a storage device,
is shown in Figure 27.

The ISI performs much the same function between an ISISlave SoPEC and the ISIMaster as the USB
connection performs between the ISIMaster and the host. This includes the transfer of all program data,
compressed page data and message (i.e. commands or status information) passing between the ISIMaster and
the ISISlave SoPECs. Existing requirements indicate that it is sufficient for the ISIMaster to initiate all
communication with the ISISlaves.

12.6.1 ISIMaster/ISISlave identification and ISISlave enumeration

Section 12.5.1 details how a SoPEC is configured as an ISIMaster or ISISlave. The ISIId is established by
software downloaded over the ISI (in broadcast mode) which looks at the input levels on a number of
GPIO pins to determine the ISIId. For any given printer that uses a multi-SoPEC configuration it is
expected that there will always be enough free GPIO pins on the ISISlaves to support this enumeration
mechanism.

12.6.2 Wake-up from sleep mode 

Either the PrintMaster SoPEC or the host may place any of the ISISlave SoPECs in sleep mode prior to
going into sleep mode itself. The ISISlave device should then ensure that its ISIWakeupEnable bit of the
WakeupEnable register (see Table 38) is set prior to entering sleep mode. In an ISISlave device the ISI
block will continue to receive power and clock during sleep mode so that it may monitor the gpio_isi_4in
lines for activity. When ISI activity is detected during sleep mode and the ISIWakeupEnable bit is set the
ISI asserts the isi_cpr_reset_n signal. This will bring the rest of the chip out of sleep mode by means of a
wakeup reset. See chapter 16 for more details of reset propagation.

12.6.3 ISI speed 

The ISI will need to run at a speed that will allow error free transmission on the PCB while minimising the
buffering and hardware requirements on SoPEC. While an ISI speed of 10 Mbit/s is adequate to match the
effective USB 1.1 bandwidth it would limit the system performance when a high-speed connection (e.g.
USB2.0, IEEE 1394) is used to attach the printer to the PC. Although they would require the use of an extra
ISI-Bridge chip such systems are envisaged for more expensive printers (compared to the low-cost basic
SoPEC powered printers that are initially being targeted) in the future.

An ISI line speed (i.e. the speed of each individual ISI wire) of 32 Mbit/s is therefore proposed as it will
allow ISI data to be oversampled 5 times (at a pclk frequency of 160 MHz). The total bandwidth of the ISI
will depend on the number of pins used to implement the interface. The current expectation is that two
pins will be used, giving a peak raw bandwidth of 64 Mbit/s, and this is the scenario that is used in this
document. However the ISI protocol will work equally well if four pins are used for transmission/reception
and this would give a peak raw bandwidth of 128 Mbit/s. The number of pins available for the ISI is
currently under investigation as part of the package selection process. With either a two or four pin ISI
solution a 32 Mbit/s line speed would allow the movement of data in to and out of a storage SoPEC (as
described in 12.3.4 above), which is the most bandwidth hungry ISI use, in a timely fashion.

The maximum effective bandwidth of a two wire ISI, after allowing for protocol overheads and bus
turnaround times, is expected to be approx. 50 Mbit/s.
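The figures quoted in this section follow from simple arithmetic, which the sketch below reproduces. The constant and function names are illustrative only; the values are those given in the text.

```python
PCLK_HZ = 160_000_000        # pclk frequency quoted in the text
ISI_LINE_RATE_BPS = 32_000_000   # proposed speed of each individual ISI wire


def oversampling_factor(pclk_hz: int = PCLK_HZ,
                        line_rate_bps: int = ISI_LINE_RATE_BPS) -> int:
    """Number of pclk cycles available per ISI bit time."""
    return pclk_hz // line_rate_bps


def peak_raw_bandwidth_bps(num_pins: int,
                           line_rate_bps: int = ISI_LINE_RATE_BPS) -> int:
    """Peak raw bandwidth when data is interleaved over num_pins wires."""
    return num_pins * line_rate_bps


print(oversampling_factor())          # 5
print(peak_raw_bandwidth_bps(2))      # 64000000
print(peak_raw_bandwidth_bps(4))      # 128000000
```

With the expected effective bandwidth of approx. 50 Mbit/s, a two wire ISI would use roughly 78% of its 64 Mbit/s peak raw bandwidth for data.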
12.6.4 ISI protocol



The ISI is a serial interface utilizing a two wire half-duplex configuration as shown in Figure 28 below. An
ISIMaster must always be present and up to 14 ISISlaves may also be on the ISI bus. The ISI bus enables
broadcasting of data, ISIMaster to ISISlave communication, ISISlave to ISIMaster communication and
ISISlave to ISISlave communication. Flow control, error detection and retransmission of errored packets are
also supported. ISI transmission is asynchronous and a Start field is present in every transmitted packet to
ensure synchronization for the duration of the packet. Bit-stuffing is required as it is expected that
synchronization cannot be guaranteed for the length of the longest allowed packet (see note 1 below). Open Issue:
This should be confirmed with the spec of the crystal used with SoPEC. We may wish to constrain the spec of
xtalin and also xtalin for the ISI-Bridge chip to ensure the ISI cannot drift out of sync during packet reception.



Figure 28. ISI configuration with four SoPEC devices: an ISIMaster SoPEC (ISIId 0) and three ISISlave
SoPECs, #1 to #3 (ISIId 1 to 3), each connected to the two ISI lines D0 and D1.

To maximize the effective ISI bandwidth while minimising pin requirements a two wire half-duplex
interleaved transmission scheme is used. Figure 29 below shows how a 16-bit word is transmitted from an
ISIMaster to an ISISlave. Data is interleaved on a bit-by-bit basis over the two ISI lines and this requires all
ISI packets to be an even number of bits in length. This interleaving could easily be extended to four pins
if required.
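A minimal sketch of bit-by-bit interleaving over two lines is given below. It assumes that the word is sent lsb first and that even-numbered bits travel on D0 and odd-numbered bits on D1; the text does not state which line carries which bit, so that mapping and the function names are assumptions made purely for illustration.

```python
from typing import List, Tuple


def word_to_bits_lsb_first(word: int, width: int = 16) -> List[int]:
    """Serialise a word into a list of bits, least significant bit first."""
    return [(word >> i) & 1 for i in range(width)]


def interleave(bits: List[int]) -> Tuple[List[int], List[int]]:
    """Split a bit stream over two lines (assumed: even bits on D0, odd on D1)."""
    if len(bits) % 2:
        raise ValueError("ISI packets must contain an even number of bits")
    return bits[0::2], bits[1::2]


def deinterleave(d0: List[int], d1: List[int]) -> List[int]:
    """Rebuild the original bit stream from the two lines."""
    out: List[int] = []
    for a, b in zip(d0, d1):
        out.extend((a, b))
    return out


bits = word_to_bits_lsb_first(0xBEEF)
d0, d1 = interleave(bits)
assert deinterleave(d0, d1) == bits   # each line carries 8 of the 16 bits
```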

All ISI transactions are initiated by the ISIMaster and every non-broadcast data packet needs to be
acknowledged by the addressed recipient. An ISISlave may only transmit when it receives a ping packet
(see section 12.6.4.5) addressed to it. To avoid bus contention all ISI devices must wait one bit-time (5 pclk
cycles) after detecting the end of a packet before transmitting a packet (assuming they are required to
transmit). All non-transmitting ISI devices must tristate their Tx drivers to avoid line contention. A pull-up
resistor is therefore required on both ISI lines to reduce the possibility of false data detection. The ISI
protocol is defined to avoid devices driving out of order (e.g. when an ISISlave is no longer being addressed).
As the ISI will use standard I/O pads there will be no physical collision detection mechanism.



1. Current max packet size 290 bits = 145 bits per line (on a 2 wire ISI) = 725 160MHz cycles. Thus the pclks in the two
communicating ISI devices should not drift by more than one cycle in 725 i.e. 1379 ppm. Careful analysis of the crystal, PLL and
oscillator specs and the sync detection circuit is needed here to ensure our solution is robust.
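The drift budget in note 1 can be reproduced as follows. This is a worked calculation only; the function name is illustrative and the inputs are the values quoted in the note.

```python
def drift_budget_ppm(packet_bits: int = 290, wires: int = 2,
                     pclk_cycles_per_bit: int = 5) -> float:
    """Largest relative clock drift that keeps the receiver within one
    pclk cycle over the longest packet, in parts per million."""
    bit_times = packet_bits // wires                 # 145 bits per line
    cycles = bit_times * pclk_cycles_per_bit         # 725 pclk cycles
    return 1_000_000 / cycles


print(int(drift_budget_ppm()))   # 1379
```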
Figure 29. Half-duplex interleaved transmission from ISIMaster to ISISlave (signals shown at the ISISlave
include D0In, D0Out and OutputEnable (=0))

There are three types of ISI packet: a long packet (used for data transmission), a ping packet (used by the 
ISIMaster to prompt ISISlaves for packets) and a short packet (used to acknowledge receipt of a packet). 
All ISI packets are delineated by Start and Stop fields and transmission is atomic, i.e. an ISI packet may
not be split or halted once transmission has started.

12.6.4.1 ISI transactions

The different types of ISI transactions are outlined in Figure 30 below. As described later all NAKs are
inferred and ACKs are not addressed to any particular ISI device.



Transaction 1: Long packet to an addressed ISISlave.

Transaction 2: Ping packet to an addressed ISISlave. ISISlave has nothing to send.

Transaction 3: Ping packet to an addressed ISISlave. ISISlaveA responds with a long packet to
ISISlaveB and ISISlaveB responds with an ACK or NAK.

Transaction 4: Ping packet to an addressed ISISlave. ISISlaveA responds with a long packet to
the ISIMaster and the ISIMaster responds with an ACK or NAK.

Figure 30. ISI transactions (between an ISIMaster, ISISlave A and ISISlave B)

12.6.4.2 Start field description and bit stuffing 

The Start field serves two purposes: to allow the start of a packet to be unambiguously identified and to
allow the receiving device to synchronise to the data stream. The symbol, or data value, used to identify a
Start field must not legitimately occur in the ensuing packet. Bit stuffing is used to guarantee that the Start
symbol will be unique in any valid (i.e. error free) packet. The Start symbol should therefore be
sufficiently long to ensure that the bit stuffing overhead is low but should still be short enough to reduce its own
contribution to the packet overhead. A Start symbol length of 8 bits is therefore used as it is an effective
compromise between these constraints. The Start field, like every byte in a packet, is transmitted with its
rightmost (lsb) bit first.

If the correct symbol value is used bit stuffing offers the further advantage of forcing transitions on the ISI
lines which will allow synchronization to be maintained. Unfortunately a symbol value that is good for
forcing transitions (e.g. 0x00) is not good for guaranteeing initial synchronization and vice versa, i.e. a symbol
such as 0xAA would ensure initial synchronization but cannot prevent synchronization being lost if a long
run of zeroes or ones is subsequently transmitted.

To resolve this conflict the Start symbol will be 0xAA and three different types of bit stuffing are used.
Whenever 0xAA is encountered in the data stream a 0 is inserted before the msb resulting in the 9-bit
value 0x12A (i.e. b10101010 -> b100101010). To ensure transitions occur during a long run of zeroes a 1
is inserted after 7 zeroes thus 0x00 becomes 0x080 (i.e. b00000000 -> b010000000). Likewise to ensure
transitions occur during a long run of ones a 0 is inserted after 7 ones thus 0xFF becomes 0x17F (i.e.
b11111111 -> b101111111). The receiving device must recognise and strip out the stuffed bits [...]
packet will be treated as an errored packet. Furthermore if the Start field is not received as 0xAA then the
FrameError status bit is set and incoming data is discarded until a correct Start field is detected.

In truly random data such a bit stuffing scheme could cause an overhead of approx. 0.15%. While the [...]

12.6.4.3 Stop field description

A 2-bit Stop field (= b11) is used to ensure that both lines return to the high state before the next packet
[...]

12.6.4.4 ISI long packet description



Field:  Start  | PktDesc (see table below) | Address | Payload  | CRC     | Stop
Size:   8 bits | 3 bits                    | 5 bits  | 256 bits | 16 bits | 2 bits

Figure 31. ISI long packet

All long packets begin with the Start field as described earlier. The PktDesc field is described in Table 29.

Table 29. PktDesc field description

Packet type indicator:
1 - Short packet
0 - Non-short (i.e. long/ping) packet

Data payload present indicator:
1 - This packet contains payload (i.e. long packet)
0 - This packet has no payload

Sequence bit value. Only valid for long packets. See section 12.6.4.8 for a
description of sequence bit operation.



Any ISI device in the system may transmit a long packet but only the ISIMaster may initiate an ISI
transaction using a long packet. An ISISlave may only send a long packet in reply to a ping message. Such a
long packet may be addressed to any ISI device in the system although the
ISIMaster (or the PrintMaster if it is a different device) will be the usual recipient.

The Address field is straightforward and complies with the ISI naming convention described in section
[...]. The payload field is [...] the transmitting ISI device and gets copied
into the receive buffer of the addressed ISI device(s). When present the payload field is always 256 bits.
[...] The CRC is calculated over the entire packet
(excluding the Start and Stop fields). The HDLC standard CRC-16 (i.e. G(x) = x^16 + x^12 + x^5 + 1) is to be
used for this calculation, which is to be performed serially.
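A bit-serial CRC using the generator polynomial x^16 + x^12 + x^5 + 1 can be sketched as below. The text specifies only the polynomial; the initial value (0xFFFF), the final inversion and the lsb-first bit order are assumptions taken from the common HDLC/X.25 frame check sequence convention, and the function name is illustrative.

```python
def crc16_hdlc(data: bytes) -> int:
    """Bit-serial CRC-16 with G(x) = x^16 + x^12 + x^5 + 1.

    Assumed conventions (not stated in the text): register preset to
    0xFFFF, bits processed lsb first, result inverted at the end.
    """
    crc = 0xFFFF
    for byte in data:
        for i in range(8):                  # one shift per transmitted bit
            bit = (byte >> i) & 1
            feedback = (crc ^ bit) & 1
            crc >>= 1
            if feedback:
                crc ^= 0x8408               # 0x1021 with its bit order reversed
    return crc ^ 0xFFFF


# Standard check string for this CRC variant.
assert crc16_hdlc(b"123456789") == 0x906E
```

In an ISI packet the input would be the PktDesc, Address and Payload fields, since the Start and Stop fields are excluded from the calculation.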



12.6.4.5 ISI ping packet



The ISI ping packet is used to allow ISISlaves to transmit on the ISI bus. As can be seen from Figure 32
below the ping packet can be viewed as a special case of the long packet. In other words it is a long
packet without any payload, whose PktDesc field is always b000 and whose [...]
addressing the ISI device [...]. In response to a ping an ISISlave
returns either an explicit ACK (if it has nothing to send), an
inferred NAK (if it [...]) or a long packet (containing the data it wished to
send). [...] of a ping packet. This is because the
ping packet will be retransmitted on a predetermined schedule (see 12.6.4.10 for more details).



Field:  Start  | PktDesc | Address (4 bits + 1 bit) | CRC     | Stop
Size:   8 bits | 3 bits  | 5 bits                   | 16 bits | 2 bits

Figure 32. ISI ping packet

[...] a ping message to the broadcast ISIId as this must have been sent in
error. An ISI ping packet will never be sent in response to any packet and may only originate from an ISIMaster.
12.6.4.6 ISI short packet description



The ISI short packet is only 14 bits long, including the Start and Stop fields. A value of b1001 is proposed
for the ACK symbol. As a 16-bit CRC is inappropriate for such a short packet it is not used. In fact there is
only one valid value for a 14-bit short ACK packet as the Start, ACK and Stop symbols all have fixed
values. Short packets are only used for acknowledgements (i.e. explicit ACKs). The format of a short ISI
packet is shown in Figure 33 below.



Field:  Start  | Ack Symbol | Stop
Size:   8 bits | 4 bits     | 2 bits

Figure 33. Short ISI packet

12.6.4.7 Error detection and retransmission 

The 16-bit CRC will provide a high degree of error detection and the probability of transmission errors
occurring is very low as the transmission channel (i.e. PCB traces) will have a low inherent bit error rate.
The probability of undetected errors should therefore be minute. A simple retransmission mechanism frees
the CPU from getting involved in error recovery for most errors because the probability of a transmission
error occurring more than once in succession is very, very low in normal circumstances.

After each non-short ISI packet is transmitted the transmitting device will open a reply window. The size
of the reply window will be 9 bit times (i.e. 14 bits transmitted on two wires plus 2 bit times to allow for
bus turnarounds and timing differences) when a short packet is expected and 147 bit times (i.e. 290 bits
transmitted on two wires plus 2 bit times to allow for bus turnarounds and timing differences) when a long
packet is expected in reply.
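The packet lengths and reply window sizes follow directly from the field widths shown in Figures 31 to 33. The sketch below recomputes them; the dictionary and function names are illustrative, and the short-packet window is derived by applying the same formula the text gives for the long-packet window of 147 bit times.

```python
LONG_PACKET_FIELDS = {"Start": 8, "PktDesc": 3, "Address": 5,
                      "Payload": 256, "CRC": 16, "Stop": 2}
SHORT_PACKET_FIELDS = {"Start": 8, "AckSymbol": 4, "Stop": 2}

LONG_PACKET_BITS = sum(LONG_PACKET_FIELDS.values())                       # 290
PING_PACKET_BITS = LONG_PACKET_BITS - LONG_PACKET_FIELDS["Payload"]       # 34
SHORT_PACKET_BITS = sum(SHORT_PACKET_FIELDS.values())                     # 14


def reply_window_bit_times(reply_bits: int, wires: int = 2,
                           margin_bit_times: int = 2) -> int:
    """Bit times needed for the expected reply, plus the allowance for
    bus turnaround and timing differences."""
    return reply_bits // wires + margin_bit_times


print(reply_window_bit_times(LONG_PACKET_BITS))    # 147
print(reply_window_bit_times(SHORT_PACKET_BITS))   # 9
```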

When a packet has been received without any errors the receiving ISI device must transmit its
acknowledge packet (which may be either a long or short packet) before the reply window closes. When detected
errors do occur the receiving ISI device will not send any response. The transmitting ISI device interprets
this lack of response as a NAK indicating that errors were detected in the transmitted packet or that the
receiving device was unable to accept the packet. If a NAK is inferred for a transmitted packet the
transmitting ISI device will keep the transmitted packet in its transmit buffer for retransmission. If the
transmitting device is the ISIMaster it will retransmit the packet immediately while if the transmitting
device is an ISISlave it will retransmit the packet in response to the next ping it receives from the ISIMaster.

The transmitting ISI device will continue retransmitting the packet when it receives a NAK until it either
receives an ACK or the number of retransmission attempts equals the value of the NumRetries register. If
the transmission was unsuccessful then the transmitting device sets the TxError bit in its ISIStatus register.
The receiving device also sets the RxError bit in its ISIStatus register whenever it detects NumRetries + 1
errored packets in succession. The NumRetries registers in all ISI devices should therefore be set to the
same value for consistent operation. Note that successful transmission or reception of ping packets do not
[...] a packet in error from
an ISISlave the NumRetries count will be reset if it subsequently receives an error free packet from any ISI
device (which may not be the ISISlave that transmitted the errored packet). Thus the RxError operation
[...] transactions as these are the only ones where retransmission will
[...] implement a NumRetriesCount window which would
allow all NAKs within a specified window to be counted. If NumRetries is exceeded within this window
then we have a RxError otherwise we can reset the count.
Note that either a transmit or receive error will cause the ISI to stop transmitting or receiving respectively.
CPU intervention will be required to resolve the source of the problem and to restart the ISI transmit or
[...]

Note that broadcast packets are never acknowledged to avoid contention on the common ISI lines. If [...]



12.6.4.8 Sequence bit operation



To ensure that communication between transmitting and receiving ISI devices is correctly ordered a
[...] to keep [...] other. Sequence bits
are not used for short or ping packets as they are not used [...]
[...] sequence bit all ISI devices keep two local sequence bits, one for each ISISubId. Furthermore each ISI
device maintains a transmit sequence bit for each ISIId and ISISubId it is in communication with. For
packets sourced from the host (via USB) the transmit sequence bit is contained in the relevant [...]
[...] CPU the transmit sequence bit is contained in the
[...] The sequence bits for received packets are stored in DMA0SeqBit and
[...] transmitting and receiving ISI devices are
correctly initialised each time a new source is selected for any ISIId.ISISubId channel.

[...] bit on either of its ISISubId channels by setting the appropriate bit in the SequenceMask register. The
sequence bit should be ignored for ISISubId channels that will carry data that can originate from more
than one source and is self-ordering e.g. control messages.

[...] by the ISISubId only when the receiver is
able to accept data and receives an error-free data packet addressed to it. The transmitting ISI device will
[...] ISIId.ISISubId channel only when it receives a valid ACK [...]

[...] transmission of two long packets with the sequence bit in both the transmitting and
receiving devices toggling from 0 to 1 and back to 0 again. The toggling operation will continue in this
manner in every subsequent transmission until an error condition is encountered.
Figure 34. Successful transmission of two long packets with sequence bit toggling
When the receiving ISI device detects an error in the transmitted long packet or is unable to accept the
packet (because of full buffers for example) it will not return any packet and it will not toggle its local
sequence bit. An example of this is depicted in Figure 35. The absence of any response prompts the
transmitting device to retransmit the original (seq=0) packet. This time the packet is received without any errors
(or buffer space may have been freed) so the receiving ISI device toggles its local sequence bit and
responds with an ACK. The transmitting device then toggles its local sequence bit to a 1 upon correct
receipt of the ACK.



Figure 35. Sequence bit operation with errored long packet

However it is also possible for the ACK packet from the receiving ISI device to be corrupted and this
scenario is shown in Figure 36. In this case the receiving device toggles its local sequence bit to 1 when the
long packet is received without error and replies with an ACK to the transmitting device. The transmitting
device detects an error in the ACK packet and so will not change its local sequence bit. It then retransmits
the seq=0 long packet. When the receiving device finds that there is a mismatch between the transmitted
sequence bit and the expected (local) sequence bit it discards the long packet and replies with an ACK.
When the transmitting ISI device correctly receives the ACK it updates its local sequence bit to a 1 thus
restoring synchronization. Note that when the SequenceMask bit for the addressed ISISubId is set then the
retransmitted packet is not discarded and so a duplicate packet will be received. The data contained in the
packet should be self-ordering and so the software handling these packets (most likely control messages)
is expected to deal with this eventuality.
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Figure 36. Sequence bit operation with ACK error 
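The sequence bit behaviour for a single ISIId.ISISubId channel can be modelled with a short simulation. This is a simplified sketch under stated assumptions: the class and method names are hypothetical, the SequenceMask option is not modelled, and faults are injected per attempt as either a corrupted long packet ("packet") or a corrupted ACK ("ack").

```python
from typing import List, Optional, Sequence


class Receiver:
    def __init__(self) -> None:
        self.seq = 0
        self.delivered: List[str] = []

    def on_long_packet(self, seq: int, payload: str, corrupted: bool) -> bool:
        """Return True if an ACK is sent; False means no response (inferred NAK)."""
        if corrupted:
            return False                 # errored packet: stay silent, no toggle
        if seq != self.seq:
            return True                  # duplicate: discard it but still ACK
        self.delivered.append(payload)
        self.seq ^= 1
        return True


class Transmitter:
    def __init__(self, num_retries: int = 2) -> None:
        self.seq = 0
        self.num_retries = num_retries
        self.tx_error = False

    def send(self, rx: Receiver, payload: str,
             faults: Sequence[Optional[str]] = ()) -> bool:
        for attempt in range(1 + self.num_retries):
            fault = faults[attempt] if attempt < len(faults) else None
            acked = rx.on_long_packet(self.seq, payload, fault == "packet")
            if acked and fault != "ack":
                self.seq ^= 1            # toggle only on a valid ACK
                return True
        self.tx_error = True             # NumRetries retransmissions exhausted
        return False


rx, tx = Receiver(), Transmitter()
tx.send(rx, "A", faults=["packet"])      # Figure 35 scenario
tx.send(rx, "B", faults=["ack"])         # Figure 36 scenario
assert rx.delivered == ["A", "B"] and tx.seq == rx.seq
```

In the second call the retransmitted packet carries the old sequence value, so the receiver discards it and only acknowledges, which is what brings the two sequence bits back into step.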
12.6.4.9 Flow control



[...] error in the received
packet. Because the SCB enjoys greater guaranteed bandwidth to DRAM than both the ISI and USB can
supply, flow control should not be required during normal operation. Any blockage on a DMA channel will
soon result in the NumRetries value being exceeded and transmission to that DMA channel being halted.
[...] packet neither the transmitting
nor the receiving ISI device will be able to differentiate the cause of a TxError or RxError.

12.6.4.10 Auto-ping operation 

[...] of the ISIMaster could send a ping packet by writing the appropriate header to the
CPUISITxBuffCntrl register it is expected that all ping packets will be generated in the ISI itself. The use
of automatically generated ping packets ensures that ISISlaves will be given access to the ISI bus with a
programmable minimum guaranteed frequency in addition to whenever it is idle. Five registers facilitate
[...]: PingSchedule0, PingSchedule1, PingSchedule2,
[...]

[...] PingScheduleN register corresponds to an ISIId that is used in the Address field of
the ping packet and a 1 in the bit position indicates that a ping packet is to be generated for that ISIId. A 0
in any bit position will ensure that no ping packet is generated for that ISIId. As ISISlaves may differ in
their bandwidth requirement (particularly if a storage SoPEC is present) three different PingSchedule
registers [...]
ISISlave. When the ISIMaster is not sending long packets (sourced from either the CPU or USB in the
[...]) [...] lsb of the PingSchedule0 register and work its way from
[...] PingScheduleN registers. When the msb of PingSchedule2 is reached
[...] each bit position of each PingScheduleN register [...]

[...] of packets in an ISIMaster
[...] USB for access to the ISI is handled
[...] arbitration between auto-ping packets and CPU/USB originating
[...] ISI. To ensure that local packets get
[...] guaranteed access to the ISI we use two 4-bit
counters [...]. An ISI transaction is initiated by the ISIMaster transmitting either a long packet or a ping
packet. The ISITotalPeriod counter is decremented for every ISI transaction when contention occurs (i.e.
both a ping and a local packet wish to transmit) while the ISILocalPeriod counter is decremented for every
local packet that is transmitted. Neither counter is decremented by a retransmitted packet.

[...] ISILocalPeriod registers. Local packets will always be given priority when
[...] ISITotalPeriod and ISILocalPeriod
[...] ISITotalPeriod counter [...]

Note that ping packets are quite likely to get more than their guaranteed bandwidth as they will be
transmitted [...] decremented [...]
[...] guaranteed bandwidth because each local
[...] counters. The difference between the values of the ISITotalPeriod
and ISILocalPeriod registers determines the number of automatically generated ping packets that are
guaranteed to be transmitted every ISITotalPeriod number of ISI transactions. If the ISITotalPeriod and
ISILocalPeriod values are the same then the local packets will always get priority and could totally exclude ping
packets if the CPU always has packets to send.

For example if ISITotalPeriod = 0xC; ISILocalPeriod = 0x8; PingSchedule0 = 0x07; PingSchedule1 =
0x06 and PingSchedule2 = 0x04 then four ping messages are guaranteed to be sent in every 12 ISI
transactions. Furthermore ISIId3 will receive 3 times the number of ping packets as ISIId1 and ISIId2 will receive
twice as many as ISIId1. Thus over a period of 36 contended ISI transactions (allowing for two full
rotations through the three PingScheduleN registers) when local packets are always pending 24 local packets
will be sent, ISIId1 will receive 2 ping packets, ISIId2 will receive 4 pings and ISIId3 will receive 6 ping
packets. If local traffic is less frequent then the ping frequency will automatically adjust upwards to
consume all idle ISI bandwidth.
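The numbers in this example can be checked with a short calculation. The sketch below does not model the two hardware counters cycle by cycle; it only applies the stated rule that ISITotalPeriod minus ISILocalPeriod pings are guaranteed in every ISITotalPeriod contended transactions, and walks the PingScheduleN registers from lsb to msb with bit0 referring to ISIId1. The function names are illustrative.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def ping_rotation(schedules: Sequence[int], num_ids: int = 14) -> List[int]:
    """ISIIds pinged in one full pass through the PingScheduleN registers."""
    order: List[int] = []
    for reg in schedules:
        for bit in range(num_ids):
            if (reg >> bit) & 1:
                order.append(bit + 1)        # bit0 -> ISIId1, bit1 -> ISIId2, ...
    return order


def contended_split(transactions: int, total_period: int,
                    local_period: int) -> Tuple[int, int]:
    """(local packets, guaranteed pings) when local packets are always pending."""
    periods = transactions // total_period
    return periods * local_period, periods * (total_period - local_period)


rotation = ping_rotation([0x07, 0x06, 0x04])     # [1, 2, 3, 2, 3, 3]
local, pings = contended_split(36, 0xC, 0x8)     # (24, 12)
counts = Counter(rotation[i % len(rotation)] for i in range(pings))
print(local, dict(counts))                       # 24 {1: 2, 2: 4, 3: 6}
```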



12.6.4.11 ISI Registers



Table 30 below details the ISI configuration registers. Note that some of these registers are also used by
other blocks in the SCB. 



Table 30. ISI configuration registers

Address      | Register name     | #bits | Reset  | Description
0x00         | ISICntrl          | 5     | 0x2    | ISI Control register
0x04         | ISIId             | 4     | 0x1    | ISIId for this SoPEC. A value of 0 indicates the device is an ISIMaster. Note that the SoPEC resets to being an ISISlave and that 0xF (the broadcast ISIId) is an illegal value and should not be written to this register.
0x08         | NumRetries        | 4     | 0x02   | Number of retransmissions to attempt in response to a NAK before aborting a long packet transmission
0x0C         | ISIPingSchedule0  | 14    | 0x0000 | Denotes which ISIIds will receive ping packets. Note that bit0 refers to ISIId1, bit1 to ISIId2 ... bit13 to ISIId14.
0x10         | ISIPingSchedule1  | 14    | 0x0000 | As per PingSchedule0
0x14         | ISIPingSchedule2  | 14    | 0x0000 | As per PingSchedule0
0x18         | ISITotalPeriod    | 4     | 0xF    | Reload value of the ISITotalPeriod counter
0x1C         | ISILocalPeriod    | 4     | 0xF    | Reload value of the ISILocalPeriod counter
0x20         | ISIStatus         | 6     | 0x00   | ISI Status register. This register is read-only.
0x24         | ISIMask           | 6     | 0x00   | ISI Interrupt Mask register
0x30 - 0x4C  | CPUISITxBuff      | 32    | n/a    | 32-byte CPUISI transmit buffer
0x50         | CPUISITxBuffCntrl | 13    | 0x0000 | Control register for the CPUISI transmit buffer
0x60 - 0x7C  | CPUISIRxBuff      | 32    | n/a    | 32-byte ISI receive buffer. This is the half of the double buffer that contains the oldest data.
0x80         | ISIRxBuffDest     | 1     | 0x0    | Only one of the CPU and the DMA manager is allowed to empty the receive buffer at any time. 1 = CPU will empty the receive buffer; 0 = DMA manager will empty the receive buffer


12.6.4.11.1 ISI control register 

The ISICntrl register is described in Table 31 below. Note that the reset value of this register allows the
device to automatically become an ISIMaster (AutoMasterEnable = 1) if any USB packets are received on
endpoints 2-4. On becoming an ISIMaster the ISIId register is set to 0, the TxEnable bit of the ISICntrl
register is set and any USB or CPU packets destined for other ISI devices are transmitted. The CPU can
override this capability at any time by clearing the AutoMasterEnable bit. Automatic ping operation can only
be enabled by the CPU as the reset values of the PingScheduleN registers are all 0 and neither DMA
channel is automatically configured.

Table 31. ISICntrl register

Name             | Bit | Description
TxEnable         | 0   | Enables ISI transmission of long or ping packets. This is cleared by transmit errors and so needs to be restarted by the CPU. Note that ACKs may still be transmitted when this bit is 0. 1 = Transmission enabled; 0 = Transmission disabled
RxEnable         | 1   | Enables ISI reception. This is cleared by receive errors and so needs to be restarted by the CPU. 1 = Reception enabled; 0 = Reception disabled
AutoPingEnable   | 2   | Enables auto-ping operation. 1 = auto-ping enabled; 0 = auto-ping disabled
AutoMasterEnable | 3   | Enables the device to automatically become the ISIMaster if activity is detected on USB endpoints 2-4. 1 = auto-master operation enabled; 0 = auto-master operation disabled



12.6.4.11.2 ISI status register

The ISIStatus register is read-only to the CPU. Status bits are set by the relevant condition occurring and
are cleared by writing to either the TxEnable or RxEnable bits of the ISICntrl register or the
CPUISITxBuff.

Table 32. ISIStatus register

Name              | Bit | Description
FrameError        | 0   | Framing error detected in the received packet. This can be caused by an incorrect Start or Stop field or by bit stuffing errors
RxError           | 1   | A CRC error or flow control condition was detected in NumRetries+1 successive packets (excluding ping packets)
RxBuffFull        | 2   | There is no space remaining in the receive double buffer
RxBuffOverflow    | 3   | An overflow has occurred in the ISI receive buffer and a packet had to be dropped.
CPUISITxBuffEmpty | 4   | The CPUISITxBuff is empty
TxError           | 5   | Transmission error. Receiving ISI device would not accept the transmitted packet. Only set after NumRetries unsuccessful retransmissions (excluding ping packets).
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12.6.4.1 1.3 ISI mask register 

An interrupt will be generated in an edge sensitive manner i.e the ISI will generate an isijcujrq pulse 
each time a status bit goes high and the corresponding bit of the ISIMask register is enabled. 

Table 33. ISIMask register

Field name                Bit(s)  Description
FrameErrorIntEn           0       Interrupt enable mask bit for the FrameError status bit
RxErrorIntEn              1       Interrupt enable mask bit for the RxError status bit
RxBuffFullIntEn           2       Interrupt enable mask bit for the RxBuffFull status bit
RxBuffOverflowIntEn       3       Interrupt enable mask bit for the RxBuffOverflow status bit
CPUISITxBuffEmptyIntEn    4       Interrupt enable mask bit for the CPUISITxBuffEmpty status bit
TxErrorIntEn              5       Interrupt enable mask bit for the TxError status bit
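
The edge sensitive behaviour of the status and mask registers can be illustrated with a minimal model. The function below is a sketch under the assumption that the status register is sampled before and after an event; the name irq_pulse and the sampling scheme are hypothetical and are not taken from the SoPEC design.

```python
def irq_pulse(prev_status, new_status, mask):
    """Return True if an enabled status bit made a 0 -> 1 transition.

    prev_status and new_status are two successive samples of a status
    register and mask is the matching interrupt enable register.
    """
    rising = ~prev_status & new_status  # bits that have just gone high
    return bool(rising & mask)
```

For example, with only RxBuffFullIntEn (bit 2) enabled, a pulse is produced when RxBuffFull first goes high but not while it simply stays high.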



12.6.4.11.4 CPUISITxBuffCntrl register

The CPUISITxBuffCntrl register contains the header field for the packet in the CPUISI transmit buffer.
Writing to this register validates the contents of the CPUISI transmit buffer, i.e. each time the CPU places a
packet in the CPUISI transmit buffer it must write the packet header to this register to initiate its transfer
into the SCB transmit buffer (see section 12.7). Note that the CPU is responsible for toggling the sequence
bit of any long packets it wishes to transmit. The CPUISITxBuffEmpty status bit will be set when
CPUTxPktSize bytes have been transferred to the SCB transmit buffer.

Table 34. CPUISITxBuffCntrl register

Field name      Bit(s)  Description
PktDesc         2:0     PktDesc field (as per Table 29) for the packet currently in the
                        CPUISI transmit buffer.
DestISISubId    3       Indicates which DMAChannel of the target SoPEC the data in the
                        CPUISI transmit buffer is destined for:
                        0 = DMAChannel0
                        1 = DMAChannel1
DestISIId       7:4     Denotes the ISIId of the target SoPEC as per Table 35
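
The field layout in Table 34 can be sketched as a packing helper. The function name and the range checks are assumptions made for illustration, and the PktDesc encoding (defined in Table 29) is treated here as an opaque 3-bit value.

```python
def pack_tx_buff_cntrl(pkt_desc, dest_isi_id, dest_isi_sub_id):
    """Pack a CPUISITxBuffCntrl value using the Table 34 field positions.

    pkt_desc        -> bits 2:0 (opaque 3-bit value in this sketch)
    dest_isi_sub_id -> bit 3
    dest_isi_id     -> bits 7:4
    """
    if not 0 <= pkt_desc <= 0x7:
        raise ValueError("PktDesc is a 3-bit field")
    if dest_isi_sub_id not in (0, 1):
        raise ValueError("DestISISubId is a 1-bit field")
    if not 0 <= dest_isi_id <= 0xF:
        raise ValueError("DestISIId is a 4-bit field")
    return (dest_isi_id << 4) | (dest_isi_sub_id << 3) | pkt_desc
```

Packing the whole header into one value reflects the fact that a single write to the register both supplies the header and validates the buffer contents.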



12.7 SCB Mapping 



In order to support maximum flexibility when moving data through a multi-SoPEC system it is possible to
map any USB endpoint onto either DMAChannel within any SoPEC in the system. A logical view of the
SCB is shown in Figure 37. This view differs from the likely implementation but it allows for a clearer
depiction of data movement within the SCB.



[Figure 37: block diagram of the SCB. Labelled elements: USB Host, USB Controller, SCB Control Block,
CPU Subsystem Bus, CPUISI TxBuffer, SCB TxBuffer, SCB Map, DMA Manager (Channel0, Channel1),
ISI (Rx, isi_din, isi_dout), CPU and DIU.]

Figure 37. SCB logical view

The SCB map, and indeed the SCB itself, is based around the concept of an ISIId and an ISISubId. Each
SoPEC in the system has a unique ISIId and two ISISubIds, namely ISISubId0 and ISISubId1. We use the
convention that ISISubId0 corresponds to DMAChannel0 in each SoPEC and ISISubId1 corresponds to
DMAChannel1. The naming convention for the ISIId is shown in Table 35 below and this would
correspond to a multi-SoPEC system such as that shown in Figure 27. We use the term ISIId instead of
SoPECId to avoid confusion with the unique ChipID used to create the SoPEC_id and SoPEC_id_key (see
chapter 17 and [9] for more details).

Table 35. ISIId naming convention

ISIId   Name
0       ISIMaster (typically a SoPEC connected to the host via USB1.1)
1-14    ISISlave1-14
15      Broadcast ISIId



An ISIId and ISISubId pair therefore allows us to address any DMAChannel in the system. The ISI,
DMA manager and SCB map hardware use the ISIId and ISISubId to handle the different data streams that
are active in a multi-SoPEC system, as does the software running on the CPU of each SoPEC. In this
document we will identify DMAChannels as ISIx.y where x is the ISIId and y is the ISISubId. Thus ISI2.1
refers to DMAChannel1 of ISISlave2. Any data sent to a broadcast channel, i.e. ISI15.0 or ISI15.1, is
received by every ISI device in the system including the ISIMaster (which may be an ISI-Bridge).

The USB controller and software stacks however have no understanding of the ISIId and ISISubId but the
Silverbrook printer driver software running on the host PC does make use of the ISIId and ISISubId. USB
is simply used as a data transport - the mapping of USB endpoints onto ISIId and SubId is communicated
from the host PC Silverbrook code to the SoPEC Silverbrook code through USB control (or possibly bulk
data) messages, i.e. the mapping information is simply data payload as far as USB is concerned. The code
running on SoPEC is responsible for parsing these messages and configuring the SCB accordingly.
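
The ISIx.y addressing convention and the ISIId names of Table 35 can be sketched as follows. The helper names are hypothetical and the code is only a model of the naming rules described above, not of any driver or firmware interface.

```python
BROADCAST_ISI_ID = 15  # Table 35: ISIId 15 is the broadcast ISIId


def channel_label(isi_id, sub_id):
    """Return the ISIx.y label for a DMAChannel (x = ISIId, y = ISISubId)."""
    if not 0 <= isi_id <= 15 or sub_id not in (0, 1):
        raise ValueError("ISIId is 0-15 and ISISubId is 0 or 1")
    return f"ISI{isi_id}.{sub_id}"


def isi_role(isi_id):
    """Return the Table 35 name for an ISIId."""
    if isi_id == 0:
        return "ISIMaster"
    if 1 <= isi_id <= 14:
        return f"ISISlave{isi_id}"
    if isi_id == BROADCAST_ISI_ID:
        return "Broadcast"
    raise ValueError("ISIId must be in the range 0-15")


def is_received_by(dest_isi_id, device_isi_id):
    """True if a device accepts data sent to dest_isi_id.

    Broadcast data is received by every ISI device, ISIMaster included.
    """
    return dest_isi_id in (device_isi_id, BROADCAST_ISI_ID)
```

For instance, channel_label(2, 1) gives the label used above for DMAChannel1 of ISISlave2.
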

The use of just two DMAChannels places some limitations on what can be achieved without software
intervention. For every SoPEC in the system there are more potential sources of data than there are sinks.
For example an ISISlave could receive both control and data messages from the ISIMaster SoPEC in
addition to control and data from the host, either specifically addressed to that particular ISISlave or over the
broadcast ISI channel. However all ISISlaves only have two possible data sinks, i.e. the two
DMAChannels. Another example is the ISIMaster in a multi-SoPEC system which may receive control messages
from each SoPEC in addition to control and data information from the host (e.g. over USB). In this case all
of the control messages are in contention for access to DMAChannel0. We resolve these potential conflicts
by adopting the following conventions:

1) Control messages may be interleaved in a memory buffer: The memory buffer that the
DMAChannel0 points to should be regarded as a central pool of control messages. Every control
message must contain fields that identify the size of the message, the source and the destination of
the control message. Control messages may therefore be multiplexed over a DMAChannel which
allows several control message sources to address the same DMAChannel. Furthermore, if
SoPEC-type control messages contain source and destination fields it is possible for the host to send control
messages to individual SoPECs over the ISI15.0 broadcast channel.

2) Data messages should not be interleaved in a memory buffer: As data messages are typically
part of a much larger block of data that is being transferred it is not possible to control their contents
in the same manner as is possible with the control messages. Furthermore we do not want the CPU
to have to perform reassembly of data blocks. Data messages from different sources cannot be
interleaved over the same DMAChannel - the SCB map must be reconfigured each time a different data
source is given access to the DMAChannel.

3) Every reconfiguration of the SCB map requires the exchange of control messages: The only
active SCB map in a multi-SoPEC system is the SCB map in the ISIMaster as all ISISlaves
automatically send data addressed to themselves to either DMAChannel0 or 1, i.e. the ISI is the only
source of incoming data in an ISISlave. The ISIMaster's SCB map reset state is shown in Figure 39
and any subsequent modifications require the exchange of control messages between the ISIMaster
and the host. As the host is expected to control the movement of data in any SoPEC system it is
anticipated that all changes to the SCB map will be performed in response to a request from the
host. While the ISIMaster could autonomously reconfigure the SCB map (this is entirely up to the
software running on the ISIMaster) it should not do so without informing the host in order to avoid
data being misrouted.

An example of the above conventions in operation is worked through in section 12.7.2. 
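
Convention 1 above, where interleaved control messages each carry their own size, source and destination, can be illustrated with a small parser. The byte layout used here (byte 0 = total message size including the header, byte 1 = source ISIId, byte 2 = destination ISIId) is purely hypothetical; the text only requires that such fields exist and does not define their encoding.

```python
HEADER_BYTES = 3  # hypothetical header: size, source ISIId, destination ISIId


def split_control_messages(buf):
    """Split a buffer of back-to-back control messages.

    Returns a list of (source, destination, payload) tuples. The header
    layout is an assumption made for this sketch only.
    """
    messages = []
    pos = 0
    while pos < len(buf):
        size = buf[pos]
        if size < HEADER_BYTES or pos + size > len(buf):
            raise ValueError("malformed control message")
        source, destination = buf[pos + 1], buf[pos + 2]
        payload = bytes(buf[pos + HEADER_BYTES:pos + size])
        messages.append((source, destination, payload))
        pos += size
    return messages
```

Because every message is self-describing, messages from several sources can share the DMAChannel0 buffer and still be separated by software, which is the property the first convention relies on.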

12.7.1 Host PC to ISIMaster SoPEC communication

When considering SCB map configurations we always assume that the ISIMaster is a SoPEC device, in
particular the SoPEC connected to the USB bus (and receiving data on USB endpoint 2, 3 or 4), rather than
an ISI-Bridge chip. ISI-Bridge chips are likely to have something similar to an SCB map and the following
information should broadly apply to an ISI-Bridge but we focus here on an ISIMaster SoPEC for clarity.

As the ISIMaster SoPEC represents the printer on the USB bus it is required by the USB specification to
have a dedicated control endpoint, EP0. At boot time the ISIMaster SoPEC will also require a bulk data
endpoint to facilitate the transfer of program code from the host PC. The simplest SCB map configuration,
i.e. for a single stand-alone SoPEC, is sufficient for host to ISIMaster SoPEC communication and is shown
in Figure 38. In this configuration all USB control information is exchanged between the host and SoPEC
over EP0 (which is the only bidirectional USB endpoint). SoPEC specific control information (printer
status, DNC info etc.) is also exchanged over EP0.

All packets sent to the host from SoPEC over EP0 must be written into the EP0 FIFO by the CPU. All
packets sent from the host to SoPEC can be placed in DRAM by the DMA Manager (as is usually the
case) or read directly by the CPU. This asymmetry is because in a multi-SoPEC environment the CPU will
need to examine all incoming control messages (i.e. messages that have arrived over DMAChannel0) to
ascertain their source and destination (i.e. they could be from an ISISlave and destined for the host) and so
the additional overhead in having the CPU move the short control messages to the EP0 FIFO is relatively
small. Furthermore we wish to avoid making the SCB more complicated than necessary, particularly when
there is no significant performance gain to be had as the control traffic will be relatively low bandwidth.

The above mechanisms are appropriate for the types of communication outlined in sections 12.4.1.1
through 12.4.1.4.

Figure 38. Single SoPEC SCB map configuration and dataflow

12.7.2 Broadcast communication

The SCB configuration for broadcast communication is shown in Figure 39. This particular configuration is
also the default, post power-on reset, configuration for the ISIMaster SoPEC. USB endpoints EP2 and EP3
[...] (the broadcast ISIId channel). EP0 is used for control messages as before and EP1 is a bulk data
endpoint for the ISIMaster SoPEC. Depending on what is [...] compressed page or other program
downloads later. For this reason [...]. In this setup the USB device configuration will take place, as it
always must, by exchanging messages over the control channel (EP0).

One possible boot mechanism is where the host PC sends the bootloader1 program code to all SoPECs by
[...]. [...] then authenticates and executes the bootloader1 program.
The ISIMaster SoPEC then polls each ISISlave (over the ISIx.0 channel). Each ISISlave ascertains
its ISIId by sampling the particular GPIO pins required by the bootloader1 and reporting its presence and
[...] both the host and the ISIMaster have knowledge of the number of SoPECs, and their ISIIds, in the
system. [...] particular multi-[...]. [...] could involve simplifying the default configuration to a single
SoPEC system (Figure 38) or remapping the broadcast channels onto DMAChannels in individual ISISlaves.

Figure 39. Default SoPEC SCB map configuration and dataflow

[...]

1) The host PC sends a control message(s) to the ISIMaster SoPEC requesting that USB EP3 be
remapped to ISI1.0.

2) The ISIMaster SoPEC sends a control message to the host PC informing it that EP3 has now been
[...].

3) [...] messages directly to ISISlave1 without requiring any CPU intervention.

12.7.3 Host PC - ISISlave SoPEC communication

The default post-boot (as opposed to post-reset) SCB map configuration for an ISISlave SoPEC is to have
[...]. [...] automatically forwards any data addressed to it (including broadcast data) to the DMAChannel
with the appropriate ISISubId. If the ISIMaster is configured correctly (e.g. when [...]) [...] from the host
destined for an ISISlave will be transmitted on the ISI with the correct address. If the ISISlave has data to
send to the host it must do so by sending a control message to the ISIMaster identifying the host as the
intended recipient. It is then the ISIMaster's responsibility to forward this message to the host.

With this configuration the host can communicate with the ISISlave via broadcast messages only and this
[...] bootloader1 program is downloaded. The ISISlave is unable to communicate [...] bootloader1
program has successfully executed and the ISISlave has determined what its ISIId is. After the
bootloader1 program (and possibly other programs) has executed the SCB map of the ISIMaster may be
reconfigured to reflect the most appropriate topology for the particular multi-SoPEC system it is part of.

All communication from an ISISlave to host is achieved by sending messages via the ISIMaster. The
ISISlave can never initiate communication to the host. If an ISISlave wishes to send a message to the host
it may do one of two things: (a) wait until it is polled by the ISIMaster or (b) indicate in its ISI
acknowledgement packet (sent in response to the reception of an ISI packet specifically addressed to that
ISISlave) that it has a message to send. When the ISIMaster receives the message from the ISISlave it first
examines it to determine the intended destination and will then copy it into the EP0 FIFO for transmission
to the host. The software running on the ISIMaster is responsible for any arbitration between messages
from different sources (including itself) that are all destined for the host.

The above mechanisms are appropriate for the types of communication outlined in sections 12.4.1.5 and
[...].

12.7.4 ISIMaster - ISISlave communication

All ISIMaster - ISISlave communication takes place over the ISI. Immediately after reset this can only be
by means of broadcast messages. Once the bootloader1 program has successfully executed on all SoPECs
in a multi-SoPEC system the ISIMaster can communicate with each SoPEC on an individual basis.

If an ISISlave wishes to send a message to the ISIMaster it may do so in response to a ping packet from the
ISIMaster. When the ISIMaster receives the message from the ISISlave it must interpret the message to
determine if the message contains information required to be sent to the host. In the case of the ISIMaster
being a SoPEC, software will transfer the appropriate information into the EP0 FIFO for transmission to
the host.

The above mechanisms are appropriate for the types of communication outlined in sections 12.4.2.3 and
12.4.2.4.



12.7.5 ISISlave - ISISlave communication

ISISlave to ISISlave communication is expected to be limited to two special cases: (a) when the
PrintMaster is not the ISIMaster and (b) when a storage SoPEC is used. When the PrintMaster is not the ISIMaster
then it will need to send control messages (and receive responses to these messages) to other ISISlaves.
When a storage SoPEC is present it may need to send data to each SoPEC in the system. All ISISlave to
ISISlave communication will take place in response to ping messages from the ISIMaster.

12.7.6 SCB Map configuration registers

The SCB map is configured by mapping a USB endpoint on to a data sink. This is performed on a
per-endpoint basis, i.e. each endpoint has a configuration register to allow its data sink to be selected. Mapping
an endpoint on to a data sink does not initiate any data flow - each endpoint/data sink needs to be enabled by
writing to the appropriate configuration registers in the USB controller / ISI / DMA manager.



Table 36. SCB Map configuration registers

Name          Reset   Description
USBEP0Dest    0x20    This register determines which of the data sinks the data arriving in
                      EP0 should be routed to.
USBEP1Dest    0x21    Data sink for USB EP1
USBEP2Dest    0x3E    Data sink for USB EP2

[...] USBEPnDest configuration registers and is described in [...]. [...] map to identify data that should
be routed to the [...] special field to [...] of by the SCB hardware. The [...] programmed with 0x20 and
0x21 (for ISI0.0 and ISI0.1) respectively to ensure data arriving on these endpoints is moved directly to
DRAM.

Table 37. USBEPnDest register

Field name      Bit(s)  Description
DestISISubId    0       Indicates which DMAChannel of the target SoPEC the endpoint is
                        mapped onto:
                        0 = DMAChannel0
                        1 = DMAChannel1
DestISIId       4:1     Denotes the ISIId of the target SoPEC as per Table 35
ChannelEn       5       Enable bit for the DMAChannel:
                        0 = Channel disabled
                        1 = Channel enabled
SequenceBit     6       Sequence bit for packets going from USBEPn to
                        DestISIId.DestISISubId. Every CPU write to this register initialises the
                        value of the sequence bit and this is subsequently updated by the ISI
                        after every successful long packet transmission.

[...] endpoints, under the control of the host, as are required for the multi-SoPEC system it is part of.
As already mentioned this mapping may be dynamically reconfigured [...]
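
The USBEPnDest field layout in Table 37 can be sketched as an encode/decode pair. The function names are assumptions made for this example. As a consistency check, the values 0x20, 0x21 and 0x3E that appear in Table 36 decode under this layout to enabled mappings onto ISI0.0, ISI0.1 and ISI15.0 respectively.

```python
def encode_usb_ep_dest(isi_id, sub_id, channel_en=True, sequence_bit=0):
    """Build a USBEPnDest value from the Table 37 fields."""
    if not 0 <= isi_id <= 15 or sub_id not in (0, 1) or sequence_bit not in (0, 1):
        raise ValueError("field value out of range")
    return (sequence_bit << 6) | (int(channel_en) << 5) | (isi_id << 1) | sub_id


def decode_usb_ep_dest(value):
    """Split a USBEPnDest value into its Table 37 fields."""
    return {
        "DestISISubId": value & 0x1,
        "DestISIId": (value >> 1) & 0xF,
        "ChannelEn": bool((value >> 5) & 0x1),
        "SequenceBit": (value >> 6) & 0x1,
    }
```

Note that, as stated above, every CPU write to the register also initialises the sequence bit, so software would need to choose the sequence_bit argument deliberately on each write.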

12.7.7 SCB transmit buffer arbitration

When the SCB transmit buffer has been emptied the SCB control logic will immediately seek to refill
[...] necessary to arbitrate between these data sources. This arbitration is controlled by the SCBTxBuffArb
register which contains a high priority bit for both the CPU and the USB. If only one of these bits is set
[...] priority. [...] CPU given access [...] the USB [...]. If both bits of SCBTxBuffArb have the same value
then arbitration will take place on a round robin basis.

[...] emptied is at least 5 times greater than it can be filled by USB traffic the double buffers used for
each USB endpoint will not overflow using the above scheme in [...]. [...] blocked such as the CPU
having priority, retransmissions on the ISI bus, channels being enabled (cf. the ChannelEn bit of the
USBEPnDest register) with data already in their associated endpoint FIFOs or short packets being sent on
the USB. Care should be taken to ensure that the USB bandwidth is efficiently utilised at all times.
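
The arbitration rule can be sketched as a small decision function. It assumes the SCBTxBuffArb bit assignments given in Table 38 (bit 0 for CPU priority, bit 1 for USB priority) and assumes that round robin simply alternates with the previous winner; the function name, the ready flags and the last_winner argument are hypothetical and are not part of the register description.

```python
def pick_source(arb, cpu_ready, usb_ready, last_winner):
    """Choose which source refills the SCB transmit buffer.

    arb         -- SCBTxBuffArb value (bit 0: CPU priority, bit 1: USB priority)
    cpu_ready   -- True if the CPUISI transmit buffer holds a packet
    usb_ready   -- True if a USB endpoint FIFO holds data
    last_winner -- "CPU" or "USB", the source served previously
    """
    if not cpu_ready and not usb_ready:
        return None
    if cpu_ready != usb_ready:  # only one source has data, so no contention
        return "CPU" if cpu_ready else "USB"
    cpu_pri = arb & 0x1
    usb_pri = (arb >> 1) & 0x1
    if cpu_pri == usb_pri:  # equal priority bits: round robin
        return "USB" if last_winner == "CPU" else "CPU"
    return "CPU" if cpu_pri else "USB"
```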



12.7.8 SCB Control Block 



The SCB control block is responsible for coordinating access to and between the various sub-blocks in the
SCB. This includes translating between the CPU subsystem bus and the USB native bus protocol, moving
data from the USB endpoint FIFOs into the SCB transmit buffer, moving data from the CPUISI transmit
buffer into the SCB transmit buffer and arbitrating between the CPU and itself for access to the SCB
sub-blocks.



Table 38. SCB control block configuration registers

Address  Register      #bits  Reset  Description
0x120    WakeupEnable  2      0x0    This register is used to gate the propagation of the USB and
                                     ISI reset signals to the CPR block. Active high.
                                     WakeupEnable[0]: usb_cpr_reset_n control
                                     WakeupEnable[1]: isi_cpr_reset_n control
0x124    SCBTxBuffArb  2      0x0    Determines which source has priority when contention arises
                                     in filling the SCBTxBuffer. When a bit is set priority is
                                     given to the relevant source.
                                     SCBTxBuffArb[0]: CPU priority
                                     SCBTxBuffArb[1]: USB priority
0x128    SCBDebugSel   10     0x000  Contains address of the register selected for debug
                                     observation as it would appear on cpu_adr[11:2]. The
                                     contents of the selected register are output in the
                                     scb_cpu_data bus while cpu_scb_sel is low and
                                     scb_cpu_debug_valid is asserted to indicate the debug data
                                     is valid.
                                     It is expected that a number of pseudo-registers will be
                                     made available for debug observation and these will be
                                     outlined with the implementation details.



12.8 DMA Manager 



The DMA manager manages the flow of data between the SCB and the embedded DRAM. Whilst the
CPU could be used for the movement of data in a USB 1.1 enabled SoPEC a DMA manager is a more
efficient solution as it will handle data in a more predictable fashion with less latency and requiring less
buffering. Furthermore a DMA manager is required to support the ISI transfer speed and to ensure that the
SoPEC could be used with a high speed ISI-Bridge chip in the future.

The DMA manager uses two independent channels, one for each ISISubId, to control the movement of
data. Both DMAChannels only support write operation and can transfer data from any USB endpoint and
from the ISI receive buffer. Data is moved at the soonest opportunity to do so and is always moved in
256-bit slices as required by the DIU. When it is not possible to use a 256-bit slice of data (e.g. at the end of a
packet or for a short packet) the DMA manager will still use 256-bit access to the DIU. This means that
(for data incoming to the SoPEC) the DMA manager will pad the valid data with zeroes until a
256-bit slice has been filled.
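
The zero padding of a final partial slice can be sketched as follows. This is a software illustration of the rule only; the function name is hypothetical and a bytes object stands in for the packet data.

```python
SLICE_BYTES = 32  # one 256-bit DIU access


def to_slices(payload):
    """Split payload into 256-bit slices, zero-padding the last one."""
    slices = []
    for start in range(0, len(payload), SLICE_BYTES):
        chunk = payload[start:start + SLICE_BYTES]
        # bytes(n) is n zero bytes, so a short final chunk is padded with zeroes
        slices.append(chunk + bytes(SLICE_BYTES - len(chunk)))
    return slices
```

A 40-byte packet therefore occupies two slices, the second holding 8 valid bytes followed by 24 zero bytes.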

The DMA manager handles all issues relating to byte/word/longword address alignment, data endianness
and transaction scheduling. It arbitrates between data arriving from the ISI and data arriving from a USB
endpoint on a round robin basis. The greater guaranteed bandwidth available to the DMA manager (50
Mbit/s at the time of writing but this may need to be increased, especially if a 4-wire ISI bus is used; see
section 20.6 for more details) ensures that the DMA manager is non-blocking.

While the DMA manager performs the work of moving data the CPU controls the destination and relative
timing of dataflows to and from the DRAM. The management of the DRAM data buffers requires the CPU
to have accurate and timely visibility of both the DMA and PEP memory usage. In other words when the
PEP has completed processing of a page band the CPU needs to be aware of the fact that an area of
memory has been freed up to receive incoming data. The management of these buffers may also be performed
by the host.



12.8.1 Circular buffer operation 



The DMA manager supports the use of circular buffers for both DMAChannels. Each circular buffer is
controlled by 5 registers: DMAnBottomAdr, DMAnTopAdr, DMAnMaxAdr, DMAnCurrWPtr and
DMAnIntAdr. The operation of the circular buffers is shown in Figure 40 below.




[Figure 40: two snapshots, (a) and (b), of a circular buffer bounded by DMAnBottomAdr and DMAnTopAdr,
showing the positions of DMAnCurrWPtr, DMAnIntAdr and DMAnMaxAdr in each. Key: free buffer space;
filled buffer space (unprocessed data); buffer space filled since last write to the
DMAnIntAdr/DMAnMaxAdr registers.]

Figure 40. Circular buffer operation

Here we see two snapshots of the status of a circular buffer with (b) occurring sometime after (a) and some
CPU writes occurring in between (a) and (b). These CPU writes are most likely to be as a result of a
finished band interrupt (which frees up buffer space) but could also have occurred in a DMA interrupt service
routine resulting from DMAnIntAdr being hit. The DMA manager will continue filling the free buffer
space depicted in (a), advancing the DMAnCurrWPtr after each write to the DIU. Note that the
DMAnCurrWPtr register always points to the next address the DMA manager will write to. When the DMA manager
reaches the address in DMAnIntAdr (i.e. DMAnCurrWPtr = DMAnIntAdr) it will generate an interrupt if the
DMAnIntAdrMask bit in the DMAMask register is set. The purpose of the DMAnIntAdr register is to alert
the CPU that data (such as a control message or a page or band header) has arrived that it needs to process.
The interrupt routine servicing the DMA interrupt will change the DMAnIntAdr value to the next location
that data of interest to the CPU will have arrived by.

In the scenario shown in Figure 40 the CPU has determined (most likely as a result of a finished band
interrupt) that the filled buffer space in (a) has been freed up and is therefore available to receive more
data. The CPU therefore moves the DMAnMaxAdr to the end of the section that has been freed up and
moves the DMAnIntAdr address to an appropriate offset from the DMAnMaxAdr address. The DMA
manager continues to fill the free buffer space and when it reaches the address in DMAnTopAdr it wraps around
to the address in DMAnBottomAdr and continues from there. DMA transfers will continue indefinitely in
this fashion until the DMA manager reaches the address in the DMAnMaxAdr register.

The circular buffer is initialised by writing the top and bottom addresses to the DMAnTopAdr and
DMAnBottomAdr registers, writing the start address (which does not have to be the same as the DMAnBottomAdr
even though it usually will be) to the DMAnCurrWPtr register and appropriate addresses to the
DMAnIntAdr and DMAnMaxAdr registers. The DMA operation will not commence until a 1 has been written to the
relevant bit of the DMAChanEn register.

While it is possible to modify the DMAnTopAdr and DMAnBottomAdr registers after the DMA has started
it should be done with caution. The DMAnCurrWPtr register should not be written to while the
DMAChannel is in operation. DMA operation may be stalled at any time by clearing the appropriate bit of
the DMAChanEn register or by disabling an SCB mapping or ISI receive operation.
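
The circular buffer behaviour described above can be sketched as a small software model. This is an illustration only: addresses are counted in 256-bit words, the class and attribute names are hypothetical, and the model assumes that the top address is written before the pointer wraps and that the location held in DMAnMaxAdr is never written.

```python
class CircularBufferModel:
    """Software model of one DMAChannel circular buffer (illustrative only)."""

    def __init__(self, bottom, top, curr, int_adr, max_adr):
        self.bottom = bottom    # models DMAnBottomAdr (inclusive)
        self.top = top          # models DMAnTopAdr (inclusive)
        self.curr = curr        # models DMAnCurrWPtr: next address to write
        self.int_adr = int_adr  # models DMAnIntAdr
        self.max_adr = max_adr  # models DMAnMaxAdr: transfers stop here
        self.int_hit = False

    def write_slice(self):
        """Model one 256-bit write; return False if stalled at max_adr."""
        if self.curr == self.max_adr:
            return False
        # The DRAM write to address self.curr would take place here.
        self.curr = self.bottom if self.curr == self.top else self.curr + 1
        if self.curr == self.int_adr:
            self.int_hit = True  # pointer has reached DMAnIntAdr
        return True
```

In this model the CPU frees space by assigning a new max_adr, after which write_slice succeeds again, mirroring the way the CPU moves DMAnMaxAdr after a finished band interrupt.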

12.8.2 DMA manager DRAM bandwidth requirements 

The DIU must guarantee the SCB enough bandwidth to ensure that neither a USB endpoint FIFO nor the
ISI receive buffer can overrun. For example, to facilitate bursty 32 Mbit/s transfers a SoPEC with a
64-byte ISI receive buffer would need to be able to transfer 256 bits every 1280 cycles (@160 MHz). This is
in addition to the USB transactions targeted at the ISIMaster SoPEC which may be in the region of 8-9
Mbit/s. While USB has a backpressure mechanism SoPEC should strive to obtain optimum USB
bandwidth utilization and so USB backpressuring should only be used as a last resort. The DIU currently
guarantees 50 Mbit/s to the SCB and more bandwidth will be available when other DIU requestors do not take
their slots. This is sufficient for the SCB's requirements.
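
The figure of 256 bits every 1280 cycles follows directly from the data rate and the clock frequency, as the short calculation below shows. The function names are illustrative.

```python
def cycles_per_slice(rate_bits_per_s, clock_hz, slice_bits=256):
    """Clock cycles available per 256-bit transfer at the given data rate."""
    return slice_bits * clock_hz // rate_bits_per_s


def sustained_rate(clock_hz, cycles, slice_bits=256):
    """Data rate in bit/s when one slice is moved every `cycles` cycles."""
    return slice_bits * clock_hz // cycles
```

At 160 MHz, 1280 cycles last 8 microseconds, and 256 bits every 8 microseconds is 32 Mbit/s.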

12.8.3 DMA manager configuration registers 

All of the circular buffer registers are 256-bit word aligned as required by the DIU. The DMAnBottomAdr
and DMAnTopAdr registers are inclusive, i.e. the addresses contained in those registers form part of the
circular buffer. The DMAnCurrWPtr always points to the next location the DMA manager will write to so
interrupts are generated whenever the DMA manager reaches the address in either the DMAnIntAdr or
DMAnMaxAdr registers rather than when it actually writes to these locations. It therefore cannot write to
the location in the DMAnMaxAdr register.

Doc: SoPEC_hardware_design S3 Proprietary Document 29 Nov 2002
Version: 2.3 - Page 131



Table 39. DMA Manager Configuration Registers

Address  Register       #bits  Reset     Description
0x200    DMA0BottomAdr  17     0x0_0000  The 256-bit aligned DRAM address of the bottom of the
                                         circular buffer serviced by DMAChannel0.
0x204    DMA0TopAdr     17     0x0_0000  The 256-bit aligned DRAM address of the top of the
                                         circular buffer serviced by DMAChannel0.
0x208    DMA0CurrWPtr   17     0x0_0000  The 256-bit aligned DRAM address of the next location
                                         DMAChannel0 will write to. This register is set by the
                                         CPU at the start of a DMA operation and dynamically
                                         updated by the DMA manager during the operation.
0x20C    DMA0IntAdr     17     0x0_0000  The 256-bit aligned DRAM address of the location that
                                         will trigger an interrupt when reached by DMAChannel0
                                         buffer.
0x210    DMA0MaxAdr     17     0x0_0000  The 256-bit aligned DRAM address of the last free
                                         location in the DMAChannel0 circular buffer. The
                                         DMAChannel0 transfers will stop when it reaches this
                                         address.
0x214    DMA0SeqBit     1      0x0       Sequence bit for DMAChannel0. This bit may be
                                         initialised by the CPU but is updated by the ISI each
                                         time an error-free long packet is received.
0x218    DMA1BottomAdr  17     0x0_0000  The 256-bit aligned DRAM address of the bottom of the
                                         circular buffer serviced by DMAChannel1.
0x21C    DMA1TopAdr     17     0x0_0000  The 256-bit aligned DRAM address of the top of the
                                         circular buffer serviced by DMAChannel1.
0x220    DMA1CurrWPtr   17     0x0_0000  The 256-bit aligned DRAM address of the next location
                                         DMAChannel1 will write to. This register is set by the
                                         CPU at the start of a DMA operation and dynamically
                                         updated by the DMA manager during the operation.
0x224    DMA1IntAdr     17     0x0_0000  The 256-bit aligned DRAM address of the location that
                                         will trigger an interrupt when reached by DMAChannel1
                                         buffer.
0x228    DMA1MaxAdr     17     0x0_0000  The 256-bit aligned DRAM address of the last free
                                         location in the DMAChannel1 circular buffer. The
                                         DMAChannel1 transfers will stop when it reaches this
                                         address.
0x22C    DMA1SeqBit     1      0x0       Sequence bit for DMAChannel1. This bit may be
                                         initialised by the CPU but is updated by the ISI each
                                         time an error-free long packet is received.
0x230    DMAChanEn      2      0x0       Enable DMA operation on a per channel basis. Active
                                         high.
                                         DMAChanEn[0]: Enable DMAChannel0
                                         DMAChanEn[1]: Enable DMAChannel1
0x234    DMAStatus      4      0x0       DMA status register. See section 12.8.3.1.
                                         This register is read-only.
0x238    DMAMask        4      0x0       DMA mask register. See section 12.8.3.2.



12.8.3.1 DMAStatus register 

The contents of the DMAStatus register are read-only to the CPU. The status bits are not sticky bits, i.e.
they reflect the 'live' status of the channel. Status bits may only be cleared by writing to the relevant
DMAnIntAdr or DMAnMaxAdr register.



Table 40. DMA Status Register

Field name               Bit(s)  Description
DMAChannel0IntAdrHit     0       DMAChannel0 has reached the address contained in the
                                 DMA0IntAdr register
DMAChannel0MaxAdrHit     1       DMAChannel0 has reached the address contained in the
                                 DMA0MaxAdr register
DMAChannel1IntAdrHit     2       DMAChannel1 has reached the address contained in the
                                 DMA1IntAdr register
DMAChannel1MaxAdrHit     3       DMAChannel1 has reached the address contained in the
                                 DMA1MaxAdr register



12.8.3.2 DMAMask register

All bits of the DMAMask are both readable and writable by the CPU. The DMA manager cannot alter the
value of this register. All interrupts are edge sensitive, i.e. the DMA manager will generate a dma_icu_irq
pulse each time a status bit goes high and the corresponding mask bit is enabled.



Table 41. DMA Manager Mask Register

Field name                   Bit(s)  Description
DMAChannel0IntAdrHitMask     0       1 = Generate an interrupt when the DMAChannel0IntAdrHit
                                     status bit goes high
                                     0 = Do not generate an interrupt when the
                                     DMAChannel0IntAdrHit status bit goes high
DMAChannel0MaxAdrHitMask     1       1 = Generate an interrupt when the DMAChannel0MaxAdrHit
                                     status bit goes high
                                     0 = Do not generate an interrupt when the
                                     DMAChannel0MaxAdrHit status bit goes high
DMAChannel1IntAdrHitMask     2       As per DMAChannel0IntAdrHitMask
DMAChannel1MaxAdrHitMask     3       As per DMAChannel0MaxAdrHitMask



12.9 SCB Implementation 



This section is still a work in progress - the information here should be ignored as it refers to an earlier
version of the SCB.



[Block diagram: the SCB comprises the DMA Manager, USB, ISI, CPU Subsystem Interface and SCB Switch
blocks. The DMA Manager connects to the DRAM (via the DIU) through the scb_diu_wreq, scb_diu_wvalid,
scb_diu_wadr, scb_diu_data, diu_scb_wack, scb_diu_rreq, scb_diu_radr, diu_scb_rack, diu_scb_rvalid and
diu_data signals. The DMA Manager, USB and ISI blocks each connect to the SCB Switch through data and
control signals (dma_scbs_data, scbs_dma_data, dma_scbs_cntrl, usb_scbs_data, usb_scbs_cntrl,
isi_scbs_data, scbs_isi_data, isi_scbs_cntrl). The CPU Subsystem Interface connects to the CPU block.]
Characteristics of the data channels:

USB: Packets should be moved sequentially out of the endpoint FIFOs. The USB is the slowest component
in the SCB but its bandwidth is most precious. However both the DMA and ISI can transfer data (50
and 40 Mbps respectively) much faster than the USB can receive data (12 Mbps peak rate) so no flow
control problems will occur due to a speed mismatch. If one of the DMA or ISI data sinks becomes blocked or
inactive then the USB controller will assert backpressure (by NAKing packets) when the double buffer for
the associated endpoint is filled. Other endpoints will remain active in this scenario and the DMA and ISI
will still be able to transfer data at their peak rates. The worst case scenario is when all endpoints have
their double buffers filled (because all the data sinks had been blocked/disabled) and then all data sinks
become available again. In this case the backlog will be fully cleared in 3 USB 64-byte packet times.

ISI: The ISI can support simultaneous reception and transmission of packets. ISI packets should be
transferred sequentially in either direction. The ISI is expected to handle the packet header and trailer, if any is
used for error detection, in both directions, i.e. only raw payload data is routed through the SCB.

DMA: The DMA channels are unidirectional but their direction, namely whether they are transferring
data to or from DRAM, is programmable. Each DMA transaction to DRAM will be 256 bits wide but all
256 bits are not always valid. When a transfer of less than 256 bits is required the DMA manager pads the
remaining bits in the 256-bit word with zeroes, in the case of a write to DRAM, or discards the unnecessary
bits in the case of a DRAM read. Can we get by with single (256 bits each way or maybe even 256
bits in all?) buffering for the DRAM manager?
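
The padding and discarding behaviour for partial 256-bit transfers can be sketched as follows. This is an illustrative model only: the function names are hypothetical, and it assumes byte granularity with the valid data occupying the start of the word, neither of which is stated in the text above.

```python
WORD_BYTES = 32  # one 256-bit DRAM word

def pad_for_write(payload: bytes) -> bytes:
    """Pad a short payload with zeroes to a full 256-bit word (DRAM write)."""
    if len(payload) > WORD_BYTES:
        raise ValueError("payload exceeds one 256-bit word")
    return payload + bytes(WORD_BYTES - len(payload))

def trim_after_read(word: bytes, valid_bytes: int) -> bytes:
    """Discard the unneeded bytes of a 256-bit word (DRAM read)."""
    if len(word) != WORD_BYTES or not 0 <= valid_bytes <= WORD_BYTES:
        raise ValueError("expected a 256-bit word and 0..32 valid bytes")
    return word[:valid_bytes]
```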





[Figure 41 shows the DMA, USB and ISI interfaces and the CPU Subsystem Interface connected to the
Switch Logic. Signals between the interfaces and the Switch Logic: dma_dout_rdy_id[1:0], dma_dout,
dma_dout_valid, dma_din_rdy, dma_din_id[1:0], dma_din, dma_din_valid, usb_rx_data, usb_data_valid,
isi_data_rdy_id[5:0], isi_rx_data, isi_rx_data_valid, isi_tx_rdy, isi_tx_data_id[4:0], isi_tx_data,
isi_tx_data_valid.]

Figure 41. SCB Switch block diagram



SCB Switch pseudocode:

const no_data_sinks = 12

for i = 1 to no_data_sinks
    if (i <= 2) then
        sink_data is dma_din
        sink_rdy is dma_din_rdy
        sink_data_valid is dma_din_valid
        sink_id is dma_din_id
    else
        sink_data is isi_tx_data
        sink_rdy is isi_tx_rdy
        sink_data_valid is isi_tx_data_valid
        sink_id is isi_tx_data_id

    if (data_src_reg[i] != 0) then // Each data sink has an associated data source
                                   // register. A non-zero value means the sink is enabled
        if ((data_src_reg[i] & 0xF0) == 0x10) then // A USB endpoint is the data source
            if ((usb_ep_rdy[4] == 1) AND (usb_ep_rdy[3:0] == data_src_reg[i][3:0])) then
                // there is data waiting in the EP FIFO
                while ((usb_data_valid == 1) AND (sink_rdy == 1) AND clocktick)
                    sink_data = usb_rx_data
                    sink_data_valid = 1
                    if (i <= 2) then // The sink is a DMAChannel
                        sink_id[1] = 1
                        sink_id[0] = i - 1
                    else // The sink is an ISI channel
                        sink_id[5] = 1
                        sink_id[4:0] = i - 3
            else // There is no data ready to go
                sink_data_valid = 0

        elsif ((data_src_reg[i] & 0xF0) == 0x20) then // The ISI is the data source
            if (isi_data_rdy_id[3:0] == data_src_reg[i][3:0]) then // there is data waiting
                // in the ISI receive FIFO for this ISISubId
                while ((isi_rx_data_valid == 1) AND (sink_rdy == 1) AND clocktick)
                    sink_data = isi_rx_data
                    sink_data_valid = 1
                    if (i <= 2) then // The sink is a DMAChannel
                        sink_id[1] = 1
                        sink_id[0] = i - 1
                    else // The sink is an ISI channel
                        sink_id[5] = 1
                        sink_id[4:0] = i - 3
            else // There is no data ready to go
                sink_data_valid = 0

        elsif ((data_src_reg[i] & 0xF0) == 0x30) then // The DMA is the data source
            if (dma_dout_rdy_id[0] == data_src_reg[i][0]) then // there is data waiting
                // in the relevant DMA buffer for this sink
                while ((dma_dout_valid == 1) AND (sink_rdy == 1) AND clocktick)
                    sink_data = dma_dout
                    sink_data_valid = 1
                    if (i <= 2) then // The sink is a DMA channel
                        sink_id[1] = 1
                        sink_id[0] = i - 1
                    else // The sink is an ISI channel
                        sink_id[5] = 1
                        sink_id[4:0] = i - 3
            else // There is no data ready to go
                sink_data_valid = 0

The above pseudocode has a few shortcomings, particularly if all our data buses are not the same size, but
it shows the basic functionality the switch is supposed to offer. The main loop of the pseudocode (for i = 1
to no_data_sinks) dictates what happens within one timeslot. The timeslots take as long as required to
complete and loop around endlessly. The msb of the usb_ep_rdy[4:0], isi_data_rdy_id[5:0] and
dma_dout_rdy_id[1:0] signals is used to indicate that data is available in the relevant block.
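
The way the switch pseudocode interprets each data source register can be sketched in software. This is a minimal illustration under stated assumptions: the function name decode_data_src is hypothetical, the encoding (upper nibble 0x1 for USB, 0x2 for ISI, 0x3 for DMA, zero meaning disabled) is taken from the pseudocode above, and the whole low nibble is returned as the source id even though the DMA branch of the pseudocode only compares bit 0.

```python
# Upper-nibble encodings as used in the switch pseudocode.
SOURCES = {0x10: "usb", 0x20: "isi", 0x30: "dma"}

def decode_data_src(reg: int):
    """Decode an 8-bit data source register value.

    Returns None when the sink is disabled (register is zero), otherwise a
    (source_kind, source_id) tuple.
    """
    if reg == 0:
        return None
    kind = SOURCES.get(reg & 0xF0)
    if kind is None:
        raise ValueError("unknown data source encoding: 0x%02X" % reg)
    return kind, reg & 0x0F
```

For example, a register value of 0x12 would select USB endpoint 2 as the source for that sink.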
13 General Purpose IO (GPIO)

13.1 Overview 

The General Purpose IO block (GPIO) is responsible for control and interfacing of GPIO pins to the rest of
the SoPEC system. It provides easily programmable control logic to simplify control of GPIO functions.
In all there are 14 GPIO pins, some of which have special functions, as follows:

• 4 Motor control IOs internally pulled down

• 4 General purpose high drive pulsed IOs capable of driving LEDs

• 4 Open drain IOs used for LSS interfaces

• 2 Normal drive IOs used for the ISI interface in Multi-SoPEC mode

Each pin can be independently configured in either input or output mode. A
programmable de-glitching circuit exists for all input pins. Each input is a Schmitt trigger to increase noise
immunity should the input be used without the de-glitch circuit. The mapping of the above functions and
their alternate use in a slave SoPEC to GPIO pins is shown in Table 42 below.



Table 42. GPIO pin functionality

GPIO pin(s)    Function
gpio[3:0]      Motor control pins / general purpose IO
gpio[7:4]      LED driver pins / general purpose IO
gpio[11:8]     LSS interface pins / general purpose IO
gpio[13:12]    ISI interface pins / general purpose IO
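
The pin grouping in Table 42 can be expressed as a simple lookup. This is only an illustrative helper; the function name default_function is an assumption and is not part of the GPIO block.

```python
def default_function(pin: int) -> str:
    """Return the default (non-CPU) function of a GPIO pin per Table 42."""
    if not 0 <= pin <= 13:
        raise ValueError("SoPEC has GPIO pins 0 to 13 only")
    if pin < 4:
        return "motor control"
    if pin < 8:
        return "LED driver"
    if pin < 12:
        return "LSS interface"
    return "ISI interface"
```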



13.2 Motor control

The motor control pins can be directly controlled by the CPU or the motor control logic can be used to
generate the phase pulses for the stepper motors. The controller consists of two central counters from
which the control pins are derived. The central counters have several registers (see Table 44) used to
configure the cycle period, the phase, the duty cycle and the counter granularity.

There are two motor master counters (0 and 1) with identical features. The periods of the master counters
are defined by the MotorMasterClkPeriod[1:0] and MotorMasterClkSrc registers, i.e. both master counters
are derived from the same MotorMasterClkSrc. The MotorMasterClkSrc defines the timing pulses used by
the master counters to determine the timing period. The MotorMasterClkSrc can select clock sources of
1 μs, 100 μs, 10 ms and pclk timing pulses.

The MotorMasterClkPeriod[1:0] registers are set to the number of timing pulses required before the
timing period restarts. Each master counter is set to the relevant MotorMasterClkPeriod value and counts
down a unit each time a timing pulse is received.

The master counters reset to the MotorMasterClkPeriod value and count down. Once the value hits zero a new
value is reloaded from the MotorMasterClkPeriod[1:0] registers. This ensures that no master clock glitch
is generated when changing the clock period.

Each of the IO pins for the motor controller is derived from the master counters. Each pin has independent
configuration registers. The MotorMasterClkSelect[3:0] registers define which of the two master
counters to use as the source for each motor control pin. The master counter value is compared with the
configured MotorCtrlHigh and MotorCtrlLow registers. If the count is equal to the MotorCtrlHigh value the
motor control pin is set to 1; if the count is equal to the MotorCtrlLow value the motor control pin is set to 0.

This allows the phase and duty cycle of the motor control pins to be varied at pclk granularity.

The motor control generators can be paused at the end of a clock period by setting the MotorMasterClockEnable
register to zero. This allows the CPU to re-configure the motor controller without causing a glitch
on the output pins.



13.3 LED control

LED lifetime and brightness can be improved and power consumption reduced by driving the LEDs with a
pulsed rather than a DC signal. The source clock for each of the LED pins is a 7.8 kHz (128 μs period)
clock generated from the 1 μs clock pulse from the Timers block. The LEDDutySelect registers are used to
create a signal with the desired waveform. Unpulsed operation of the LED pins can be achieved by using
CPU IO direct control. By default the LED pins are controlled by the LED control logic.



[Figure 42 shows the master clock together with the LED output waveform for each LEDDutySelect value
from 0 to 7.]

Figure 42. Duty Cycle Select



13.4 LSS interface via GPIO

In some SoPEC system configurations one or more of the LSS interfaces may not be used. Unused LSS
interface pins can be reused as general IO pins by configuring the CpuIOCtrl register. When a bit in the
CpuIOCtrl is set the corresponding pin is controlled by the CPU registers, otherwise the pin is controlled
by the LSS block. By default the LSS controls GPIO pins 11 to 8.

13.5 ISI interface via GPIO

In Multi-SoPEC mode the SCB block (in particular the ISI sub-block) requires direct access to and from
the gpio[12] and gpio[13] pins. Control of the ISI interface pins is determined by the CpuIOCtrl register.
When a bit in the CpuIOCtrl is set the corresponding pin is controlled by the CPU registers, otherwise the
pin is controlled by the ISI block directly. By default the pins are directly controlled by the ISI block.
In single SoPEC systems the pins can be re-used by the GPIO.

13.6 CPU GPIO control

The CPU can assume direct control of any (or all) of the IO pins individually. On a per pin basis the CPU
can turn on direct access to the pin by setting the CpuIOCtrl register. Once set the IO pin assumes the
direction specified by the CpuIODirection register. When in output mode the value in register CpuIOOut
will be directly reflected to the output driver. When in input mode the status of the input pin can be read in
either the direct version or a de-glitched form, by reading CpuIOIn and CpuIOInDeglitch respectively.
When writing to the CpuIOOut register the top bits of the register (bits 29 to 16) are used to filter access to
the lower bits (13 to 0).

13.7 Programmable de-glitching logic 

Each IO pin can be filtered through a de-glitching logic circuit. The circuit can be configured to sample the
IO pin for a predetermined time before concluding that a pin is in a particular state. The exact sampling
length is configurable, but each GPIO pin must use one of two possible configured values (selected by
DeGlitchSelect). The sampling length is the same for both high and low states. The DeGlitchCount is
programmed to the number of system time units that a state must be valid for before the state is passed on.
The time units are selected by DeGlitchClkSel and can be one of 1 μs, 100 μs, 10 ms and pclk pulses.
For example if DeGlitchCount is set to 10 and DeGlitchClkSel set to 3, then an input pin (one of gpio[13:0])
must consistently retain its value for 10 system clock cycles (pclk) before the input state will be
propagated from CpuIOIn to CpuIOInDeglitch.
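
The de-glitching behaviour can be illustrated with a small software model. This is a sketch under assumptions, not the circuit itself: the function name deglitch is hypothetical, the input is sampled once per selected time unit, and the output is assumed to change on the sample at which the new value has been seen count times in a row (the exact hardware latency may differ by a cycle).

```python
def deglitch(samples, count, initial=0):
    """Pass a new input state on only after it has been stable for `count` samples.

    samples: raw pin values (0 or 1), one per selected time unit.
    count:   number of consecutive samples required (DeGlitchCount analogue).
    Returns the de-glitched value after each sample.
    """
    out = initial
    run = 0                     # consecutive samples that differ from `out`
    result = []
    for s in samples:
        if s == out:
            run = 0             # glitch ended before it was accepted
        else:
            run += 1
            if run >= count:    # new state held long enough, accept it
                out = s
                run = 0
        result.append(out)
    return result
```

With count set to 3, a two-sample pulse is rejected while a pulse lasting three or more samples is passed through.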

13.8 Interrupt generation 

Any of the GPIO pins can generate an interrupt from the raw or deglitched version of the input pin. There
are 14 possible interrupt sources from the GPIO to the interrupt controller, one interrupt per input pin. The
InterruptSrcSelect register determines whether the raw input or the deglitched version is used as the
interrupt source.

The interrupt type, masking and priority can be programmed in the interrupt controller. 

13.9 Frequency analyser

The frequency analyser measures the duration between successive positive edges on an input pin and
reports the last period measured (FreqAnaLastPeriod) and a running average period (FreqAnaAverage).
The running average is updated each time a new positive edge is detected and is calculated by
FreqAnaAverage = (FreqAnaAverage / 8) * 7 + FreqAnaLastPeriod / 8.

The analyser can be used with any input pin (or its deglitched form), but only one pin at a time can be
selected. The pin is selected by the FreqAnaPinSelect and its deglitched form can be selected by
FreqAnaPinFormSelect.
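
The running average update can be tried out numerically with the sketch below. It assumes the divisions are truncating integer divisions, as would be natural for a shift-based hardware implementation; the text above does not state this, and the function name update_average is illustrative only.

```python
def update_average(average: int, last_period: int) -> int:
    """Apply FreqAnaAverage = (FreqAnaAverage / 8) * 7 + FreqAnaLastPeriod / 8.

    Integer (truncating) division is an assumption of this sketch.
    """
    return (average // 8) * 7 + last_period // 8
```

Under this assumption a steady input period of 1000 leaves an average of 1000 unchanged, while a single outlier only moves the average by one eighth of the difference.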
13.10 Implementation

13.10.1 Definitions of I/O



Table 43. I/O definition

Port name                Pins  I/O  Description

Clocks and Resets
pclk                     1     In   System clock
prst_n                   1     In   System reset, synchronous active low
tim_pulse[2:0]           3     In   Timers block generated timing pulses:
                                    0 - 1 μs pulse
                                    1 - 100 μs pulse
                                    2 - 10 ms pulse

CPU Interface
cpu_addr[7:2]            6     In   CPU address bus. Only 6 bits are required to decode the
                                    address space for this block
cpu_dataout[31:0]        32    In   Shared write data bus from the CPU
gpio_cpu_data[31:0]      32    Out  Read data bus to the CPU
cpu_rwn                  1     In   Common read/not-write signal from the CPU
cpu_gpio_sel             1     In   Block select from the CPU. When cpu_gpio_sel is high both
                                    cpu_addr and cpu_dataout are valid
gpio_cpu_rdy             1     Out  Ready signal to the CPU. When gpio_cpu_rdy is high it
                                    indicates the last cycle of the access. For a write cycle this
                                    means cpu_dataout has been registered by the GPIO block and
                                    for a read cycle this means the data on gpio_cpu_data is valid.
gpio_cpu_berr            1     Out  Bus error signal to the CPU indicating an invalid access.
gpio_cpu_debug_valid     1     Out  Debug data valid on gpio_cpu_data bus. Active high
cpu_acode[1:0]           2     In   CPU Access Code signals. These decode as follows:
                                    00 - User program access
                                    01 - User data access
                                    10 - Supervisor program access
                                    11 - Supervisor data access

IO Pins
gpio_o[13:0]             14    Out  General purpose IO output to IO driver
gpio_i[13:0]             14    In   General purpose IO input from IO receiver
gpio_e[13:0]             14    Out  General purpose IO output control. Active high driving

GPIO to LSS
lss_gpio_do[1:0]         2     In   LSS bus data output
                                    Bit 0 - LSS bus 0
                                    Bit 1 - LSS bus 1
gpio_lss_di[1:0]         2     Out  LSS bus data input
                                    Bit 0 - LSS bus 0
                                    Bit 1 - LSS bus 1
lss_gpio_e[1:0]          2     In   LSS bus data output enable, active high
                                    Bit 0 - LSS bus 0
                                    Bit 1 - LSS bus 1
lss_gpio_clk[1:0]        2     In   LSS bus clock output
                                    Bit 0 - LSS bus 0
                                    Bit 1 - LSS bus 1

GPIO to ISI
gpio_isi_din[1:0]        2     Out  Input data from IO receivers to ISI
isi_gpio_dout[1:0]       2     In   Data output from ISI to IO drivers
isi_gpio_e[1:0]          2     In   GPIO ISI pins output enable (active high) from ISI interface

Interrupts
gpio_icu_irq[13:0]       14    Out  GPIO pin interrupts

Debug
debug_data_out[16:3]     14    In   Output debug data to be muxed on to the GPIO pins
debug_cntrl[16:3]        14    In   Control signal for each GPIO bound debug data line indicating
                                    whether or not the debug data should be selected by the pin
                                    mux



13.10.2 Configuration registers

The configuration registers in the GPIO are programmed via the CPU interface. Refer to section 11.4.3 on
page 70 for a description of the protocol and timing diagrams for reading and writing registers in the
GPIO. Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register
reads and writes, the lower 2 bits of the CPU address bus are not required to decode the address space for
the GPIO. When reading a register that is less than 32 bits wide zeros should be returned on the upper
unused bit(s) of gpio_cpu_data. Table 44 lists the configuration registers in the GPIO block.

Table 44. GPIO Register Definition

Address         Register                      #bits   Reset
    Description

CPU IO Control

0x00            CpuIOCtrl                     14      0x0000
    Indicates whether each IO pin is directly controlled by the CPU or not
    0 - Default Control
    1 - CPU Control

0x04            CpuIOUserModeMask             14      0x0000
    User Mode Access Mask to CPU GPIO control register. When 1 user access is enabled.
    One bit per gpio pin. Enables access to CpuIODirection, CpuIOOut, CpuIOIn and
    CpuIOInDeglitch in user mode if CpuIOCtrl allows CPU access.

0x08            CpuIOSuperModeMask            14      0x3FFF
    Supervisor Mode Access Mask to CPU GPIO control register. When 1 supervisor access is
    enabled. One bit per gpio pin. Enables access to CpuIODirection, CpuIOOut, CpuIOIn and
    CpuIOInDeglitch in supervisor mode if CpuIOCtrl allows CPU access.

0x0C            CpuIODirection                14      0x0000
    Indicates the direction of each IO pin, when controlled by the CPU
    0 - Indicates Input Mode
    1 - Indicates Output Mode

0x10            CpuIOOut                      30      0x0000_0000
    Value used to drive output pin in CPU direct mode.
    bits 13:0 - Value to drive on output GPIO pins
    bits 15:14 - Reserved (read as zero always)
    bits 29:16 - Write enable mask for bits 13:0. 0 enables write, 1 masks the write
    (read as zero always)

0x14            CpuIOIn                       14      External pin value
    Value received on each input pin regardless of mode. Read-only register.

0x18            CpuIOInDeglitch               14      0x0000
    Deglitched version of CpuIOIn register. Note that after reset this register will reflect
    the external pin values 256 pclk cycles after they have stabilized. Read-only register.

Deglitch control

0x20-0x24       DeGlitchCount[1:0]            2x8     0xFF
    De-glitch circuit sample count in DeGlitchClkSrc selected units for pins gpio[13:0]

0x28-0x2C       DeGlitchClkSrc[1:0]           2x2     0x3
    Specifies the unit used by the GPIO deglitch circuits:
    0 - 1 μs pulse
    1 - 100 μs pulse
    2 - 10 ms pulse
    3 - pclk

0x30            DeGlitchSelect                14      0x000
    Specifies which deglitch count (DeGlitchCount) and unit select (DeGlitchClkSrc) should
    be used to deglitch each GPIO pin
    0 - Specifies DeGlitchCount[0] and DeGlitchClkSrc[0]
    1 - Specifies DeGlitchCount[1] and DeGlitchClkSrc[1]

Motor Control

0x34            MotorCtrlUserModeEnable       1       0x0
    User Mode Access enable to Motor control configuration registers. When 1 user access is
    enabled. Enables user access to MotorMasterClkPeriod, MotorMasterClkSrc,
    MotorDutySelect, MotorPhaseSelect, MotorMasterClockEnable and MotorMasterClkSelect
    registers

0x38 to 0x3C    MotorMasterClkPeriod[1:0]     2x16    0x0000
    Specifies the motor controller master clock periods in MotorMasterClkSrc selected units

0x40            MotorMasterClkSrc             2       0x0
    Specifies the unit used by the motor controller master clock generator:
    0 - 1 μs pulse
    1 - 100 μs pulse
    2 - 10 ms pulse
    3 - pclk

0x44 to 0x50    MotorCtrlHigh[3:0]            4x16    0x0000
    Specifies the low to high transition point in the clock period for each motor control pin.

0x54 to 0x60    MotorCtrlLow[3:0]             4x16    0xFFFF
    Specifies the high to low transition point in the clock period for each motor control pin.

0x64 to 0x70    MotorMasterClkSelect[3:0]     4x1     0x0
    Specifies which motor master clock should be used as a pin generator source
    0 - Clock derived from MotorMasterClockPeriod[0]
    1 - Clock derived from MotorMasterClockPeriod[1]

0x74            MotorMasterClockEnable        2       0x0
    Enable the motor master clock counter. When 1 count is enabled
    Bit 0 - Enable motor master clock 0
    Bit 1 - Enable motor master clock 1

LED control

0x78            LEDCtrlUserModeEnable         4       0x0
    User Mode Access enable to LED control configuration registers. When 1 user access is
    enabled. One bit per LEDDutySelect select register.

0x7C to 0x88    LEDDutySelect[3:0]            4x3     0x0
    Specifies the duty cycle for each LED pin. See Figure 42 for encoding details. The
    LEDDutySelect[3:0] registers determine the duty cycle of the gpio[7:4] pins

Frequency Analyser

0x8C            FreqAnaPinSelect              4       0x00
    Selects which GPIO input should be used for the frequency analysis.

0x90            FreqAnaPinFormSelect          1       0x0
    Selects if the frequency analyser should use the raw input or the deglitched form.
    0 - Deglitched form of input pin
    1 - Raw form of input pin

0x94            FreqAnaLastPeriod             16      0x0000
    Frequency Analyser last period of selected input pin.

0x98            FreqAnaAverage                16      0x0000
    Frequency Analyser average period of selected input pin.

0x9C            FreqAnaCountInc               20      0x00000
    Frequency Analyser counter increment amount. For each clock cycle no edge is detected on
    the selected input pin the accumulator is incremented by this amount.

Miscellaneous

0xA0            InterruptSrcSelect            14      0x000
    Interrupt source select, 1 bit per GPIO pin. Determines whether the interrupt source is
    direct from the input pin or the deglitched version
    1 - Input pin direct
    0 - Deglitched input pin

0xA4            DebugSelect                   6       0x00
    Debug address select. Indicates the address of the register to report on the
    gpio_cpu_data bus when it is not otherwise being used.

0xA8-0xAC       MotorMasterCount              2x16    0x0000
    Motor master clock counter values.
    Bus 0 - Master clock count 0
    Bus 1 - Master clock count 1
    Read-only registers
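
Because GPIO registers are 32-bit aligned, only cpu_addr[7:2] is needed to select a register. The sketch below illustrates that decode for a few Table 44 entries. The names GPIO_REGISTERS and register_index are hypothetical, and rejecting an unaligned offset with an exception is a choice of this sketch rather than documented hardware behaviour.

```python
# A subset of Table 44, keyed by byte offset within the GPIO block.
GPIO_REGISTERS = {
    0x00: "CpuIOCtrl",
    0x04: "CpuIOUserModeMask",
    0x08: "CpuIOSuperModeMask",
    0x0C: "CpuIODirection",
    0x10: "CpuIOOut",
    0x14: "CpuIOIn",
    0x18: "CpuIOInDeglitch",
}

def register_index(byte_offset: int) -> int:
    """Return the 6-bit word index carried on cpu_addr[7:2]."""
    if byte_offset & 0x3:
        raise ValueError("GPIO registers are 32-bit aligned")
    return (byte_offset >> 2) & 0x3F
```

For instance, CpuIOOut at byte offset 0x10 corresponds to word index 4.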
13.10.2.1 Supervisor and user mode access

The configuration registers block examines the CPU access type (cpu_acode signal) and determines if the
access is allowed to that particular register, based on configured user access registers. If an access is not
allowed the GPIO will issue a bus error by asserting the gpio_cpu_berr signal.

Access to the CpuIODirection, CpuIOOut, CpuIOIn and CpuIOInDeglitch is filtered by the
CpuIOUserModeMask and CpuIOSuperModeMask registers. Each bit masks access to the corresponding bits in the
CpuIO* registers for each mode, with CpuIOUserModeMask filtering user data mode access and
CpuIOSuperModeMask filtering supervisor data mode access.

The addition of the CpuIOSuperModeMask register helps prevent potential conflicts between user and
supervisor code read modify write operations. For example a conflict could exist if the user code is
interrupted during a read modify write operation by a supervisor ISR which also modifies the CpuIO* registers.

An attempt to write to a disabled bit in user or supervisor mode will be ignored, and an attempt to read a
disabled bit returns zero. If there are no user mode enabled bits then access is not allowed in user mode
and a bus error will result. Similarly for supervisor mode.

When writing to the CpuIOOut register, bits 29 to 16 are used to mask the write to CpuIOOut[13:0]. If
the mask bit is zero the write is active to the corresponding CpuIOOut pin, otherwise the write to that pin is
ignored.

The pseudocode for determining access to the CpuIOOut register is shown below. Similar code could
be shown for the CpuIODirection, CpuIOIn and CpuIOInDeglitch registers.
if (cpu_acode == SUPERVISOR_DATA_MODE) then
    // supervisor mode
    if (CpuIOSuperModeMask[13:0] == 0) then
        // access is denied, and bus error
        gpio_cpu_berr = 1
    elsif (cpu_rwn == 1) then
        // read mode, filtered by mask
        gpio_cpu_data[13:0] = (CpuIOOut[13:0] & CpuIOSuperModeMask[13:0])
    else
        // write mode, filtered by mask
        mask[13:0] = ~(cpu_dataout[29:16]) & CpuIOSuperModeMask[13:0]
        CpuIOOut[13:0] = ((cpu_dataout[13:0] & mask[13:0]) |
                          (CpuIOOut[13:0] & ~(mask[13:0])))
elsif (cpu_acode == USER_DATA_MODE) then
    // user data mode
    if (CpuIOUserModeMask[13:0] == 0) then
        // access is denied, and bus error
        gpio_cpu_berr = 1
    elsif (cpu_rwn == 1) then
        // read mode, filtered by mask
        gpio_cpu_data[13:0] = (CpuIOOut[13:0] & CpuIOUserModeMask[13:0])
    else
        // write mode, filtered by mask
        mask[13:0] = ~(cpu_dataout[29:16]) & CpuIOUserModeMask[13:0]
        CpuIOOut[13:0] = ((cpu_dataout[13:0] & mask[13:0]) |
                          (CpuIOOut[13:0] & ~(mask[13:0])))
else
    // access is denied, bus error
    gpio_cpu_berr = 1
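
The masked read and write of CpuIOOut can be checked with a small software model of the pseudocode above. This is a sketch only: BusError, read_cpu_io_out and write_cpu_io_out are hypothetical names, mode_mask stands for whichever of CpuIOUserModeMask or CpuIOSuperModeMask applies to the access, and raising an exception stands in for asserting gpio_cpu_berr.

```python
PIN_MASK = 0x3FFF  # 14 GPIO pins, bits 13:0

class BusError(Exception):
    """Stands in for the gpio_cpu_berr signal in this sketch."""

def read_cpu_io_out(cpu_io_out: int, mode_mask: int) -> int:
    """Read CpuIOOut; bits disabled by the mode mask read back as zero."""
    if (mode_mask & PIN_MASK) == 0:
        raise BusError("no bits enabled for this access mode")
    return cpu_io_out & mode_mask & PIN_MASK

def write_cpu_io_out(cpu_io_out: int, cpu_dataout: int, mode_mask: int) -> int:
    """Return the new CpuIOOut[13:0] after a filtered write.

    Bits 29:16 of cpu_dataout are the per-pin write mask: 0 enables the
    write to the matching pin, 1 leaves that pin unchanged.
    """
    if (mode_mask & PIN_MASK) == 0:
        raise BusError("no bits enabled for this access mode")
    write_disable = (cpu_dataout >> 16) & PIN_MASK
    mask = ~write_disable & mode_mask & PIN_MASK      # bits actually written
    return (cpu_dataout & mask) | (cpu_io_out & ~mask & PIN_MASK)
```

A benefit of the per-pin write mask is that software can update a single pin with one write, without a read modify write sequence.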
Table 45 details the access modes allowed for registers in the GPIO block. In supervisor mode all registers
are accessible. In user mode forbidden accesses will result in a bus error (gpio_cpu_berr asserted).



Table 45. GPIO supervisor and user access modes

Address         Register                      Access
0x00            CpuIOCtrl                     Supervisor data mode only
0x04            CpuIOUserModeMask             Supervisor data mode only
0x08            CpuIOSuperModeMask            Supervisor data mode only
0x0C            CpuIODirection                CpuIOUserModeMask and CpuIOSuperModeMask filtered
0x10            CpuIOOut                      CpuIOUserModeMask and CpuIOSuperModeMask filtered
0x14            CpuIOIn                       CpuIOUserModeMask and CpuIOSuperModeMask filtered
0x18            CpuIOInDeglitch               CpuIOUserModeMask and CpuIOSuperModeMask filtered
0x20-0x24       DeGlitchCount[1:0]            Supervisor data mode only
0x28-0x2C       DeGlitchClkSrc[1:0]           Supervisor data mode only
0x30            DeGlitchSelect                Supervisor data mode only
0x34            MotorCtrlUserModeEnable       Supervisor data mode only
0x38 to 0x3C    MotorMasterClkPeriod[1:0]     MotorCtrlUserModeEnable enabled
0x40            MotorMasterClkSrc             MotorCtrlUserModeEnable enabled
0x44 to 0x50    MotorCtrlHigh[3:0]            MotorCtrlUserModeEnable enabled
0x54 to 0x60    MotorCtrlLow[3:0]             MotorCtrlUserModeEnable enabled
0x64 to 0x70    MotorMasterClkSelect[3:0]     MotorCtrlUserModeEnable enabled
0x74            MotorMasterClockEnable        MotorCtrlUserModeEnable enabled
0x78            LEDCtrlUserModeEnable         Supervisor data mode only
0x7C            LEDDutySelect[0]              LEDCtrlUserModeEnable[0] enabled
0x80            LEDDutySelect[1]              LEDCtrlUserModeEnable[1] enabled
0x84            LEDDutySelect[2]              LEDCtrlUserModeEnable[2] enabled
0x88            LEDDutySelect[3]              LEDCtrlUserModeEnable[3] enabled
0x8C            FreqAnaPinSelect              Supervisor data mode only
0x90            FreqAnaPinFormSelect          Supervisor data mode only
0x94            FreqAnaLastPeriod             Supervisor data mode only
0x98            FreqAnaAverage                Supervisor data mode only
0x9C            FreqAnaCountInc               Supervisor data mode only
0xA0            InterruptSrcSelect            Supervisor data mode only
0xA4            DebugSelect                   Supervisor data mode only
0xA8-0xAC       MotorMasterCount              Supervisor data mode only
13.10.3 GPIO partition



[Figure 43 shows the GPIO block partitioned into the configuration registers (connected to the CPU),
frequency analyser, input de-glitch, motor control, LED pulse generator and IO control sub-blocks, with
the IO control sub-block also connecting to the LSS and ISI.]

Figure 43. GPIO partition



13.10.4 IO control

The IO control block connects the IO pin drivers to internal signalling based on configured setup registers
and debug control signals.

The motor, LED pins, ISI and LSS control logic:

// motor and led pins
for (i=0; i<14; i++) {
    if (debug_cntrl[i] == 1) then
        gpio_e[i] = 1
        gpio_o[i] = debug_data_out[i]
        cpu_io_in[i] = gpio_i[i]
    elsif (cpu_io_ctrl[i] == 1) then
        gpio_e[i] = cpu_io_dir[i]
        gpio_o[i] = cpu_io_out[i]
        cpu_io_in[i] = gpio_i[i]
    else
        // default control
        if (i < 4) then // motor control pins
            gpio_e[i] = 1
            gpio_o[i] = motor_ctrl[i]
            cpu_io_in[i] = gpio_i[i]
        elsif (i < 8) then // LED pins
            gpio_e[i] = 1
            gpio_o[i] = led_ctrl[i]
            cpu_io_in[i] = gpio_i[i]
        elsif (i < 10) then // LSS interface clock pins
            gpio_e[i] = 1
            gpio_o[i] = lss_gpio_clk[i-8]
            cpu_io_in[i] = gpio_i[i]
        elsif (i < 12) then // LSS interface data pins
            gpio_e[i] = lss_gpio_e[i-10]
            gpio_o[i] = lss_gpio_do[i-10]
            gpio_lss_di[i-10] = gpio_i[i]
        else // ISI interface pins
            gpio_e[i] = isi_gpio_e[i-12]
            gpio_o[i] = isi_gpio_dout[i-12]
            gpio_isi_din[i-12] = gpio_i[i]
}
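
The output side of this priority scheme (debug first, then CPU direct control, then the default owner of the pin) can be modelled in a few lines. This is an illustrative sketch, not the RTL: the function io_control and the dictionary of signal lists are hypothetical, the debug signals are indexed 0 to 13 as in the pseudocode rather than [16:3], and led_ctrl is assumed to be indexed from 0, so pin i uses led_ctrl[i - 4] here.

```python
def io_control(i, s):
    """Return (gpio_e, gpio_o) for pin i given a dict `s` of signal lists."""
    if not 0 <= i <= 13:
        raise ValueError("pin index must be 0..13")
    if s["debug_cntrl"][i]:                       # debug data has priority
        return 1, s["debug_data_out"][i]
    if s["cpu_io_ctrl"][i]:                       # CPU direct control
        return s["cpu_io_dir"][i], s["cpu_io_out"][i]
    if i < 4:                                     # motor control pins
        return 1, s["motor_ctrl"][i]
    if i < 8:                                     # LED pins
        return 1, s["led_ctrl"][i - 4]
    if i < 10:                                    # LSS interface clock pins
        return 1, s["lss_gpio_clk"][i - 8]
    if i < 12:                                    # LSS interface data pins
        return s["lss_gpio_e"][i - 10], s["lss_gpio_do"][i - 10]
    return s["isi_gpio_e"][i - 12], s["isi_gpio_dout"][i - 12]  # ISI pins
```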

13.10.5 LED pulse generator 

The pulse generator logic consists of a 7-bit counter that is incremented on a 1 μs pulse from the timers
block (tim_pulse[0]). The LED control signal is generated from comparing the count value with the
configured duty cycle for the LED (led_duty_sel).

The logic is given by:

for (i=0; i<4; i++) { // for each LED pin
    // period divided into 8 segments
    period_div8 = cnt[6:4];
    if (period_div8 <= led_duty_sel[i]) then
        led_ctrl[i] = 1
    else
        led_ctrl[i] = 0
    // in higher half invert the led control
    if (cnt[6] == 1) then
        led_ctrl[i] = ~led_ctrl[i]
}

// update the counter every 1 μs pulse
if (tim_pulse[0] == 1) then
    cnt++

13.10.6 Motor control

The motor controller consists of 2 counters, and 4 phase generator logic blocks, one per motor control pin.
The counters decrement each time a timing pulse (cnt_en) is received. The counters start at the configured
clock period value (motor_mas_clk_period) and decrement to zero. If the counters are enabled (via
motor_mas_clk_enable), the counters will automatically restart at the configured clock period value,
otherwise they will wait until the counters are re-enabled.

The timing pulse period is one of pclk, 1 μs, 100 μs, 10 ms depending on the motor_mas_clk_sel signal. The
counters are used to derive the phase and duty cycle of each motor control pin.

// decrement logic
if (cnt_en == 1) then
    if ((mas_cnt == 0) AND (motor_mas_clk_enable == 1)) then
        mas_cnt = motor_mas_clk_period[15:0]
    elsif ((mas_cnt == 0) AND (motor_mas_clk_enable == 0)) then
        mas_cnt = 0
    else
        mas_cnt--
else // hold the value
    mas_cnt = mas_cnt



Figure 44. Motor control RTL diagram

The phase generator block generates the motor control logic based on the selected clock generator
(motor_mas_clk_sel), the motor control high transition point (motor_ctrl_high) and the motor control low
transition point (motor_ctrl_low). There are 4 instances, one per motor control pin.

The logic is given by:

// select the input counter to use
if (motor_mas_clk_sel == 1) then
    count = mas_cnt[1]
else
    count = mas_cnt[0]
// generate the phase and duty cycle
if ((motor_ctrl == 1) AND (count == motor_ctrl_low)) then
    motor_ctrl = 0
elsif ((motor_ctrl == 0) AND (count == motor_ctrl_high)) then
    motor_ctrl = 1
else
    motor_ctrl = motor_ctrl // remain the same
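
To see how the counter and the two transition points shape the output waveform, the decrement and phase logic can be modelled in a few lines of Python. This is a behavioural sketch under stated assumptions: the function names are hypothetical, only one counter and one pin are modelled, and the phase logic sees the counter value updated in the same step (real registered logic may differ by one cycle).

```python
def step_counter(mas_cnt, period, enable):
    """One cnt_en pulse of the master counter."""
    if mas_cnt == 0:
        return period if enable else 0   # restart only when enabled
    return mas_cnt - 1


def step_phase(motor_ctrl, count, ctrl_high, ctrl_low):
    """Phase generator for one motor control pin."""
    if motor_ctrl == 1 and count == ctrl_low:
        return 0
    if motor_ctrl == 0 and count == ctrl_high:
        return 1
    return motor_ctrl                    # remain the same


def simulate(period, ctrl_high, ctrl_low, ticks):
    """Return the motor_ctrl value after each of `ticks` timing pulses."""
    cnt, ctrl, trace = 0, 0, []
    for _ in range(ticks):
        cnt = step_counter(cnt, period, True)
        ctrl = step_phase(ctrl, cnt, ctrl_high, ctrl_low)
        trace.append(ctrl)
    return trace


trace = simulate(period=3, ctrl_high=3, ctrl_low=1, ticks=8)
```

In this model a period value of 3 gives a waveform that repeats every 4 timing pulses, because the counter passes through 3, 2, 1 and 0 before reloading.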
13.10.7 Input deglitch

The input deglitch logic rejects input states of duration less than the configured number of time units
(deglitch_cnt); input states of greater duration are reflected on the output cpu_io_in_deglitch. The time
units used by the deglitch circuit (either pclk, 1 µs, 100 µs or 1 ms) are selected by the deglitch_clk_src bus.

There are 2 possible sets of deglitch_cnt and deglitch_clk_src that can be used to deglitch the input pins.
The values used are selected by the deglitch_sel signal.

Each input pin can be used to generate an interrupt. The interrupt can be generated from the raw input
signal or a deglitched version of the input. The interrupt source is selected by the interrupt_src_select signal.

The counter logic is given by

if (cpu_io_in != cpu_io_in_delay) then
    cnt = deglitch_cnt
    output_en = 0
elsif (cnt == 0) then
    cnt = cnt
    output_en = 1
elsif (cnt_en == 1) then
    cnt--
    output_en = 0
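
The effect of the counter logic is easiest to see on a short input sequence. The Python model below is a sketch only: the class name is hypothetical, and it assumes that the deglitched output simply latches the input whenever output_en is 1, which the text implies but does not spell out.

```python
class Deglitch:
    """Behavioural model of one deglitched input pin."""

    def __init__(self, deglitch_cnt, initial=0):
        self.deglitch_cnt = deglitch_cnt
        self.prev_in = initial   # cpu_io_in_delay
        self.cnt = 0
        self.out = initial       # cpu_io_in_deglitch

    def tick(self, pin, cnt_en=1):
        if pin != self.prev_in:          # input changed: restart the count
            self.cnt = self.deglitch_cnt
            output_en = 0
        elif self.cnt == 0:              # input stable for long enough
            output_en = 1
        elif cnt_en:
            self.cnt -= 1
            output_en = 0
        else:
            output_en = 0
        self.prev_in = pin
        if output_en:                    # assumption: output latches the input
            self.out = pin
        return self.out


glitch = Deglitch(3)
glitch_out = [glitch.tick(p) for p in [1, 1, 0, 0, 0, 0, 0]]   # 2-tick glitch
steady = Deglitch(2)
steady_out = [steady.tick(p) for p in [1, 1, 1, 1, 1]]         # sustained change
```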



Figure 45. Input de-glitch RTL diagram



13.10.8 Frequency Analyser 

The frequency analyser block monitors a selected input pin (selected by FreqAnaPinSelect and
FreqAnaPinFormSel) and detects positive edges. Between successive positive edges detected on the input pin it
increments a counter by a programmed amount (FreqAnaCountInc) on each clock cycle. When a positive
edge is detected the FreqAnaLastPeriod register is updated with the top 16 bits of the counter and the
counter is reset. The frequency analyser also maintains a running average of the FreqAnaLastPeriod
register. Each time a positive edge is detected on the input pin the FreqAnaAverage register is updated with the
newly calculated FreqAnaLastPeriod. The average is calculated as 7/8 of the current value plus 1/8 of the new
value. Both the FreqAnaLastPeriod and FreqAnaAverage registers can be written to by the CPU.

The pseudocode is given by

if ((pin == 1) AND (pin_delay == 0)) then // positive edge detected
    freq_ana_lastperiod = count[31:16]
    freq_ana_average = freq_ana_average - freq_ana_average/8 + freq_ana_lastperiod/8
    count = 0
else
    count = count + freq_ana_count_inc
// implement the configuration register write
if (wr_last_en == 1) then
    freq_ana_lastperiod = wr_data
elsif (wr_average_en == 1) then
    freq_ana_average = wr_data
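
As a worked example of the 7/8 and 1/8 weighting, the edge update can be written as a small Python function. This is a sketch, not the hardware: the function name is hypothetical, integer division stands in for the hardware divide by 8, and the average is assumed to use the newly captured period, as the text describes.

```python
def update_on_edge(count, average):
    """Return (last_period, new_average) when a positive edge is detected."""
    last_period = (count >> 16) & 0xFFFF          # top 16 bits of the 32-bit counter
    new_average = (average - average // 8 + last_period // 8) & 0xFFFF
    return last_period, new_average


# Counter reached 800 << 16 since the previous edge; the running average was 1600
last_period, average = update_on_edge(800 << 16, 1600)
```

With these inputs the average moves one eighth of the way from 1600 toward 800, giving 1500.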



Figure 46. Frequency analyser RTL diagram
14 Interrupt Controller Unit (ICU)

The interrupt controller accepts up to N input interrupt sources, determines their priority, arbitrates based
on the highest priority and generates an interrupt request to the CPU. The ICU complies with the interrupt
acknowledge protocol of the CPU. Once the CPU accepts an interrupt (i.e. processing of its service routine
begins) the interrupt controller will assert the next arbitrated interrupt if one is pending.

Each interrupt source has a fixed vector number N, and an associated configuration register, IntReg[N].
The format of the IntReg[N] register is shown in Table 46 below.



Table 46. IntReg[N] register format

Field      Bit(s)   Description
Priority   7:0      Interrupt priority
Type       9:8      Determines the triggering conditions for the interrupt
                    00 - Positive edge
                    10 - Negative edge
                    01 - Positive level
                    11 - Negative level
Mask       10       Mask bit.
                    1 - Interrupts from this source are enabled,
                    0 - Interrupts from this source are disabled.
                    Note that there may be additional masks in operation at the source of the
                    interrupt.
Reserved   31:11    Reserved. Write as 0.



Once an interrupt is received the interrupt controller determines the priority and maps the programmed
priority to the available CPU priority levels, and then issues an interrupt to the CPU. The mapping of
programmed priority to native interrupt levels will be fixed, and is dependent on CPU choice.

For example, for the LEON CPU there are 15 levels available, which would allow 16 sub-priorities per level
(as each level is in itself a priority). In this case priorities 255-240 map to level 15, 239-224 to level 14 and
so on, with priorities 15-0 corresponding to level 0. Level 0 is no interrupt. Level 15 is the highest interrupt
level.
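
The register layout and the LEON mapping can be illustrated with a small Python decoder. This is a sketch under stated assumptions: the function and dictionary names are hypothetical, and the level is taken as the upper four bits of the 8-bit priority, which matches the 16-priorities-per-level mapping described above.

```python
TRIGGER_TYPES = {
    0b00: "positive edge",
    0b10: "negative edge",
    0b01: "positive level",
    0b11: "negative level",
}


def decode_int_reg(value):
    """Split an IntReg[N] value into its fields and derive the LEON level."""
    priority = value & 0xFF                       # bits 7:0
    return {
        "priority": priority,
        "type": TRIGGER_TYPES[(value >> 8) & 0x3],  # bits 9:8
        "enabled": bool((value >> 10) & 1),         # bit 10, 1 = enabled
        "level": priority >> 4,                     # 16 sub-priorities per level
    }


cfg = decode_int_reg(0x4F3)   # mask = 1, type = 00, priority = 0xF3
```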



14.1 INTERRUPT PREEMPTION 

There are two types of pre-emption possible: standard LEON pre-emption and SoPEC pending pre-emption.
With standard LEON pre-emption an interrupt can only be pre-empted by an interrupt with a higher
priority level. If an interrupt with the same priority level (1 to 15) as the interrupt being serviced becomes
pending then it is not acknowledged until the current service routine has completed. The SoPEC pending
pre-emption is an extension of the standard LEON scheme which is made possible by the programmable
priority levels in the IntReg[N] register.

Interrupts with a higher sub-priority will pre-empt interrupts with a lower sub-priority but the same
priority level mapping, provided the lower sub-priority interrupt has not yet been acknowledged by the CPU,
i.e. it is still pending. If an interrupt with a higher sub-priority arrives while an interrupt with a lower
sub-priority at the same level is being serviced then it will not be serviced until the lower sub-priority
service routine has completed.

Thus when pre-emption is required, interrupts should be programmed to different levels, as interrupt
priorities of the same level have no guaranteed servicing order.

The interrupt is directly acknowledged by the CPU and the ICU automatically clears the pending bit of
acknowledged interrupts.

All interrupt controller registers are only accessible in supervisor data mode. If user code wishes to
mask an interrupt it must request this from the supervisor, and the supervisor software will resolve user
access levels.



14.2 Interrupt sources 



The mapping of interrupt sources to interrupt vectors (and therefore IntReg[N] registers) is shown in
Table 47 below. Please refer to the appropriate section of this specification for more details of the interrupt
sources.

Table 47. Interrupt sources vector table

Vector   Source   Description
0        Timers   WatchDog Timer Update request
1        Timers   Generic Timer 1 interrupt
2        Timers   Generic Timer 2 interrupt
3        Timers   Generic Timer 3 interrupt
4-17     GPIO     GPIO general interrupt, source pin 0-13
18       MMU      MMU Security violation
19       SCB      USB interrupt
20       SCB      ISI interrupt
21       SCB      DMA interrupt
22       LSS      LSS interrupt. LSS interface 0 interrupt request
23       LSS      LSS interrupt. LSS interface 1 interrupt request
24       PCU      PEP Sub-system Interrupt - CDU finished band
25       PCU      PEP Sub-system Interrupt - CDU error
26       PCU      PEP Sub-system Interrupt - LBD finished band
27       PCU      PEP Sub-system Interrupt - TE finished band
28       PCU      PEP Sub-system Interrupt - PCU finished band
29       PCU      PEP Sub-system Interrupt - PCU invalid address interrupt
30       PCU      PEP Sub-system Interrupt - PHI Buffer underrun
31       PCU      PEP Sub-system Interrupt - PHI Page finished
32       PCU      PEP Sub-system Interrupt - PHI Print ready
33       PHI      PEP Sub-system Interrupt - PHI Line Sync Interrupt
14.3 Implementation

14.3.1 Definitions of I/O

Table 48. Interrupt Controller Unit I/O definition

Port name                  Pins   I/O   Description

Clocks and Resets
pclk                       1      In    System Clock
prst_n                     1      In    System reset, synchronous active low

CPU Interface
cpu_adr[7:2]               6      In    CPU address bus. Only 6 bits are required to decode the
                                        address space for the ICU block
cpu_dataout[31:0]          32     In    Shared write data bus from the CPU
icu_cpu_data[31:0]         32     Out   Read data bus to the CPU
cpu_rwn                    1      In    Common read/not-write signal from the CPU
cpu_icu_sel                1      In    Block select from the CPU. When cpu_icu_sel is high both
                                        cpu_adr and cpu_dataout are valid
icu_cpu_rdy                1      Out   Ready signal to the CPU. When icu_cpu_rdy is high it
                                        indicates the last cycle of the access. For a write cycle this
                                        means cpu_dataout has been registered by the ICU block
                                        and for a read cycle this means the data on icu_cpu_data is
                                        valid.
icu_cpu_ilevel[3:0]        4      Out   Indicates the priority level of the current active interrupt.
cpu_iack                   1      In    Interrupt request acknowledge from the LEON core.
cpu_icu_ilevel[3:0]        4      In    Interrupt acknowledged level from the LEON core
icu_cpu_berr               1      Out   Bus error signal to the CPU indicating an invalid access.
cpu_acode[1:0]             2      In    CPU Access Code signals. These decode as follows:
                                        00 - User program access
                                        01 - User data access
                                        10 - Supervisor program access
                                        11 - Supervisor data access
icu_cpu_debug_valid        1      Out   Debug Data valid on icu_cpu_data bus. Active high

Interrupts
tim_icu_wd_irq             1      In    Watchdog timer interrupt signal from the Timers block
tim_icu_irq[2:0]           3      In    Generic timer interrupt signals from the Timers block
gpio_icu_irq[13:0]         14     In    GPIO pin interrupts
mmu_icu_irq                1      In    Memory Management Unit interrupt
usb_icu_irq                1      In    USB interrupt from the SCB
isi_icu_irq                1      In    ISI interrupt from the SCB
dma_icu_irq                1      In    DMA interrupt from the SCB
lss_icu_irq[1:0]           2      In    LSS interface interrupt request
cdu_finishedband           1      In    Finished band interrupt request from the CDU
cdu_icu_jpegerror          1      In    JPEG error interrupt from the CDU
lbd_finishedband           1      In    Finished band interrupt request from the LBD
te_finishedband            1      In    Finished band interrupt request from the TE
pcu_finishedband           1      In    Finished band interrupt request from the PCU
pcu_icu_address_invalid    1      In    Invalid address interrupt request from the PCU
phi_icu_underrun           1      In    Buffer underrun interrupt request from the PHI
phi_icu_page_finish        1      In    Page finished interrupt request from the PHI
phi_icu_print_rdy          1      In    Print ready interrupt request from the PHI
phi_icu_linesync_int       1      In    Line sync interrupt request from the PHI



14.3.2 Configuration registers 

The configuration registers in the ICU are programmed via the CPU interface. Refer to section 11.4 on
page 69 for a description of the protocol and timing diagrams for reading and writing registers in the ICU.
Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register reads and
writes, the lower 2 bits of the CPU address bus are not required to decode the address space for the ICU.
When reading a register that is less than 32 bits wide, zeros should be returned on the upper unused bit(s)
of icu_cpu_data. Table 49 lists the configuration registers in the ICU block.

The ICU block will only allow supervisor data mode accesses (i.e. cpu_acode[1:0] =
SUPERVISOR_DATA). All other accesses will result in icu_cpu_berr being asserted.



Table 49. ICU Register Map

Address      Register          #bits   Reset         Description
0x00-0x84    IntReg[33:0]      34x11   0x000         Interrupt vector configuration register
0x88-0x8C    IntClear[1:0]     2x32    0x0000_0000   Interrupt pending clear register. If written with a one
                                                     it clears the corresponding interrupt.
                                                     IntClear[0] - Interrupt sources 31 to 0
                                                     IntClear[1] - Interrupt sources 33 to 32
0x90-0x94    IntPending[1:0]   2x32    0x0000_0000   Interrupt pending register. (Read Only)
                                                     IntPending[0] - Interrupt sources 31 to 0
                                                     IntPending[1] - Interrupt sources 33 to 32
0x98         IntSource         6       0x00          Indicates the interrupt source of the current winning
                                                     active interrupt. (Read Only)
0x9C         DebugSelect       6       0x00          Debug address select. Indicates the address of the
                                                     register to report on the icu_cpu_data bus when it
                                                     is not otherwise being used.
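
The register map implies a simple relationship between a vector number and the addresses and bit positions software would use. The Python helpers below sketch that arithmetic. The helper names are hypothetical, the offsets are relative to the ICU base address, and it is assumed that source N occupies bit N mod 32 of word N div 32 in the IntClear and IntPending registers.

```python
INT_REG_BASE = 0x00
INT_CLEAR_BASE = 0x88
NUM_SOURCES = 34


def int_reg_offset(n):
    """Byte offset of IntReg[n]; registers are 32-bit word aligned."""
    if not 0 <= n < NUM_SOURCES:
        raise ValueError("vector out of range")
    return INT_REG_BASE + 4 * n


def int_clear_write(n):
    """Return (offset, value) of the write that clears pending source n."""
    if not 0 <= n < NUM_SOURCES:
        raise ValueError("vector out of range")
    return INT_CLEAR_BASE + 4 * (n // 32), 1 << (n % 32)


def is_pending(n, pending_words):
    """Test source n given the two IntPending words [IntPending[0], IntPending[1]]."""
    return bool((pending_words[n // 32] >> (n % 32)) & 1)
```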
14.3.3 ICU partition

Figure 47. ICU partition



14.3.4 Interrupt detect 



The ICU contains multiple instances of the interrupt detect block, one per interrupt source. The interrupt
detect block examines the interrupt source signal, and determines whether it should generate a request
pending (int_pend) based on the configured interrupt type and the interrupt source conditions. If the interrupt is
not masked the interrupt will be reflected to the interrupt arbiter via the int_active signal. Once an interrupt
is pending it remains pending until the interrupt is accepted by the CPU or it is level sensitive and gets
removed. Masking a pending interrupt has the effect of removing the interrupt from arbitration, but the
interrupt will still remain pending.

When the CPU accepts the interrupt (using the normal ISR mechanism), the interrupt controller
automatically generates an interrupt clear for that interrupt source (cpu_int_clear). Alternatively, if the interrupt is
masked, the CPU can determine pending interrupts by polling the IntPending registers. Any active pending
interrupts can be cleared by the CPU without using an ISR via the IntClear registers.

The logic is shown below:

mask = int_config[10]
type = int_config[9:8]
int_priority = int_config[7:0]
int_pend = last_int_pend // the last pending interrupt
// update the pending FF
if ((int_clear == 1) OR (cpu_int_clear == 1)) then
    int_pend = 0
// test for interrupt condition
if ((type == NEG_LEVEL) AND (int_src == 0)) then
    int_pend = 1
elsif ((type == POS_LEVEL) AND (int_src == 1)) then
    int_pend = 1
elsif ((type == NEG_EDGE) AND (int_src == 0) AND (last_int_src == 1)) then
    int_pend = 1
elsif ((type == POS_EDGE) AND (int_src == 1) AND (last_int_src == 0)) then
    int_pend = 1
else
    int_pend = int_pend // stay the same as before
// mask the pending bit
if (mask == 1) then
    int_active = int_pend
else
    int_active = 0
// assign the registers
last_int_src = int_src
last_int_pend = int_pend
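
The following Python model shows how a single interrupt detect block behaves over a few clock cycles. It is a sketch, not the RTL: the class name is hypothetical, a positive edge is taken to mean a 0-to-1 transition of the source (and a negative edge a 1-to-0 transition), and a clear is assumed to take effect unless the trigger condition is still true in the same cycle.

```python
POS_EDGE, POS_LEVEL, NEG_EDGE, NEG_LEVEL = 0b00, 0b01, 0b10, 0b11


class InterruptDetect:
    """Behavioural model of one interrupt detect block."""

    def __init__(self, int_config):
        self.int_config = int_config   # IntReg[N] value
        self.last_src = 0
        self.pend = 0

    def tick(self, src, clear=0):
        mask = (self.int_config >> 10) & 1
        kind = (self.int_config >> 8) & 0x3
        pend = 0 if clear else self.pend
        if kind == NEG_LEVEL and src == 0:
            pend = 1
        elif kind == POS_LEVEL and src == 1:
            pend = 1
        elif kind == NEG_EDGE and src == 0 and self.last_src == 1:
            pend = 1
        elif kind == POS_EDGE and src == 1 and self.last_src == 0:
            pend = 1
        self.last_src = src
        self.pend = pend
        return pend if mask else 0     # int_active


edge = InterruptDetect((1 << 10) | (POS_EDGE << 8) | 0x20)
active = [edge.tick(0), edge.tick(1), edge.tick(1), edge.tick(1, clear=1)]
masked = InterruptDetect(POS_LEVEL << 8)   # mask bit 0: source disabled
masked_active = masked.tick(1)
```

The masked instance still records the pending condition internally, which mirrors the statement that masking removes an interrupt from arbitration without clearing it.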

14.3.5 Interrupt arbiter

The interrupt arbiter logic arbitrates a winning interrupt request from multiple pending requests based on
configured priority. It generates the interrupt to the CPU by setting icu_cpu_ilevel to a non-zero value. The
priority of the interrupt is reflected by the value assigned to icu_cpu_ilevel: the higher the value the higher
the priority, 15 being the highest. The current winning interrupt is reported to the CPU via the IntSource
register generated in the interrupt arbiter block.

// arbitrate based on priority
if (arb_enable == 1) then {
    // arbitrate with the current winner
    win_int_priority = 0
    int_src = 0
    int_request = 0
    for (i=0; i<34; i++) {
        if (int_active[i] == 1) then {
            if (int_priority[i] > win_int_priority) then {
                win_int_priority = int_priority[i]
                int_src = i
                int_request = 1
            }
        }
    }
    // assign the CPU interrupt level
    int_ilevel = int_priority[int_src][7:4]
}
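
The arbitration loop translates almost directly into Python. The sketch below uses a hypothetical function name and returns the level from the winning priority, so that it is 0 when nothing is requested; the pseudocode instead reads int_priority[int_src] unconditionally, which gives the same result whenever a request exists.

```python
def arbitrate(int_active, int_priority):
    """Return (int_request, int_src, int_ilevel) for one arbitration pass."""
    win_priority, int_src, int_request = 0, 0, 0
    for i, active in enumerate(int_active):
        if active and int_priority[i] > win_priority:
            win_priority, int_src, int_request = int_priority[i], i, 1
    int_ilevel = (win_priority >> 4) & 0xF   # bits 7:4 of the winning priority
    return int_request, int_src, int_ilevel


result = arbitrate([1, 0, 1, 1], [0x35, 0xFF, 0xA2, 0xA0])
```

Source 1 has the highest programmed priority here but is not active, so source 2 wins. Because the comparison is strict, an active source programmed with priority 0 never generates a request, and among equal priorities the lowest vector number wins.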

14.3.6 Interrupt controller

The interrupt controller is responsible for generating the interrupt to the CPU, accepting the interrupt
acknowledge from the CPU and clearing the interrupt source pending bit.

The exact procedure is CPU dependent, but examples are given for the LEON processor. See section 11.9
on page 98 for a complete description of the interrupt handling procedure.



Figure 48. Interrupt controller state diagram

The machine remains in the same state by default.
All outputs are zero unless otherwise stated.

State description:

Reset:    Normal reset state
IntPend:  Interrupt pending, waiting for CPU acknowledge
IntClear: Interrupt clear, clear the pending bit for the current interrupt vector

After reset the interrupt controller remains in the Reset state until the interrupt arbiter indicates that there is
an active interrupt pending (int_request equal to 1). The state machine goes to the IntPend state and signals to
the CPU that an interrupt is pending. The machine will remain in the IntPend state until the interrupt is
acknowledged by the CPU or the pending interrupt condition is removed.

When the interrupt is acknowledged the state machine goes to the IntClear state to clear the pending bit of
the interrupt source.

On completion the state machine returns to the Reset state and again waits for the next pending interrupt.
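
The three-state behaviour described above can be captured in a small transition function. This Python sketch makes two assumptions that the text does not state explicitly: that the machine returns to Reset when the pending condition is removed before acknowledge, and that the acknowledge is visible as a single cpu_iack flag. The function name is hypothetical.

```python
def next_state(state, int_request, cpu_iack):
    """Return the next interrupt controller state."""
    if state == "Reset":
        return "IntPend" if int_request else "Reset"
    if state == "IntPend":
        if cpu_iack:
            return "IntClear"            # acknowledged: go clear the pending bit
        # assumption: fall back to Reset if the request disappears
        return "IntPend" if int_request else "Reset"
    return "Reset"                       # IntClear always returns to Reset


# Walk through one request / acknowledge / clear cycle
state, visited = "Reset", []
for int_request, cpu_iack in [(0, 0), (1, 0), (1, 0), (1, 1), (0, 0)]:
    state = next_state(state, int_request, cpu_iack)
    visited.append(state)
```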
15 Timers Block (TIM)

The Timers block contains general purpose timers, a watchdog timer and a timing pulse generator for use in
other sections of SoPEC.

15.1 Watchdog timer

The watchdog timer is a 32-bit counter value which counts down each time a timing pulse is received. The
period of the timing pulse is selected by the WatchDogUnitSel register. The value at any time can be read
from the WatchDogTimer register and the counter can be reset by writing a non-zero value to the register.
Should the counter reach 1, a system wide reset will be triggered as if the reset came from a hardware pin.

The watchdog timer can be polled by the CPU and reset each time it gets close to 1, or alternatively a
threshold (WatchDogIntThres) can be set to trigger an interrupt for the watchdog timer to be serviced by
the CPU. This interrupt can be effectively masked by setting the threshold to zero. The watchdog timer can
be disabled, without causing a reset, by writing zero to the WatchDogTimer register.

15.2 Timing pulse generator

The timing block contains a timing pulse generator clocked by the system clock, used to generate timing
pulses of 1 µs, 100 µs and 10 ms. Each pulse is of one system clock duration and is active high, with the
pulse period accurate to the system clock frequency.

The timing pulse generator also contains a 64-bit free running counter that can be read or reset by
accessing the FreeRunCount register.

15.3 Generic timers

SoPEC contains 3 programmable generic timing counters, for use by the CPU to time the system. The
timers are programmed to a particular value and count down each time a timing pulse is received. If a
particular timer decrements to 0, then an interrupt is generated. The counter can be programmed to automatically
restart the count, or wait until re-programmed by the CPU. At any time the status of the counter can be
read from GenCntValue, or can be reset by writing to the GenCntValue register. The auto-restart is activated
by setting the GenCntAuto register; when activated the counter restarts at GenCntStartValue. A counter
can be started or stopped at any time, without affecting the contents of the GenCntValue register, by
writing a 1 or 0 to the relevant GenCntEnable register.
15.4 Implementation

15.4.1 Definitions of I/O

Table 50. Timers block I/O definition

Port name              Pins   I/O   Description

Clocks and Resets
pclk                   1      In    System Clock
prst_n                 1      In    System reset, synchronous active low
tim_pulse[2:0]         3      Out   Timers block generated timing pulses, each one pclk wide
                                    0 - 1 µs pulse
                                    1 - 100 µs pulse
                                    2 - 10 ms pulse

CPU Interface
cpu_adr[6:2]           5      In    CPU address bus. Only 5 bits are required to decode the
                                    address space for the TIM block
cpu_dataout[31:0]      32     In    Shared write data bus from the CPU
tim_cpu_data[31:0]     32     Out   Read data bus to the CPU
cpu_rwn                1      In    Common read/not-write signal from the CPU
cpu_tim_sel            1      In    Block select from the CPU. When cpu_tim_sel is high both
                                    cpu_adr and cpu_dataout are valid
tim_cpu_rdy            1      Out   Ready signal to the CPU. When tim_cpu_rdy is high it
                                    indicates the last cycle of the access. For a write cycle this
                                    means cpu_dataout has been registered by the TIM block
                                    and for a read cycle this means the data on tim_cpu_data is
                                    valid.
tim_cpu_berr           1      Out   Bus error signal to the CPU indicating an invalid access.
cpu_acode[1:0]         2      In    CPU Access Code signals. These decode as follows:
                                    00 - User program access
                                    01 - User data access
                                    10 - Supervisor program access
                                    11 - Supervisor data access
tim_cpu_debug_valid    1      Out   Debug Data valid on tim_cpu_data bus. Active high

Miscellaneous
tim_icu_wd_irq         1      Out   Watchdog timer interrupt signal to the ICU block
tim_icu_irq[2:0]       3      Out   Generic timer interrupt signals to the ICU block
tim_cpr_reset_n        1      Out   Watchdog timer system reset.
15.4.2 Timers sub-block partition

Figure 49. Timers sub-block partition diagram



15.4.3 Watchdog timer 

The watchdog timer counts down from a pre-programmed value, and generates a system wide reset when
equal to one. When the counter passes a pre-programmed threshold (wdog_tim_thres) value an interrupt is
generated (tim_icu_wd_irq) requesting the CPU to update the counter. Setting the counter to zero disables
the watchdog reset. In supervisor mode the watchdog counter can be written to or read from at any time; in
user mode access is denied. Any accesses in user mode will generate a bus error.

Figure 50. Watchdog timer RTL diagram



The counter logic is given by

if (wdog_wen == 1) then
    wdog_tim_cnt = wdog_tim_data // load new data
elsif (wdog_tim_cnt == 0) then // count disabled
    wdog_tim_cnt = wdog_tim_cnt
elsif (cnt_en == 1) then
    wdog_tim_cnt--
else
    wdog_tim_cnt = wdog_tim_cnt

The timer decode logic is

if ((wdog_tim_cnt == wdog_tim_thres) AND (wdog_tim_cnt != 0)) then
    tim_icu_wd_irq = 1
else
    tim_icu_wd_irq = 0
// reset generator logic
if (wdog_tim_cnt == 1) then
    tim_cpr_reset_n = 0
else
    tim_cpr_reset_n = 1
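
A short Python model makes the relationship between the count, the interrupt threshold and the reset easier to follow. It is a sketch only: the class name is hypothetical, every call to `tick` stands for one enabled timing pulse, and the interrupt is assumed to be raised while the count equals a non-zero threshold.

```python
class Watchdog:
    """Behavioural model of the watchdog counter and its decode logic."""

    def __init__(self, count=0xFFFFFFFF, threshold=0):
        self.cnt = count
        self.thres = threshold

    def write(self, value):
        """CPU write to the counter; writing 0 disables the watchdog."""
        self.cnt = value & 0xFFFFFFFF

    def tick(self):
        """One timing pulse; returns (tim_icu_wd_irq, tim_cpr_reset_n)."""
        if self.cnt != 0:                # a count of zero means disabled
            self.cnt -= 1
        irq = int(self.cnt == self.thres and self.cnt != 0)
        reset_n = int(self.cnt != 1)     # active-low reset asserted at count 1
        return irq, reset_n


wd = Watchdog(count=5, threshold=3)
outputs = [wd.tick() for _ in range(4)]
```

In this run the interrupt is raised when the count reaches the threshold of 3, and the active-low reset is asserted when the count reaches 1.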



15.4.4 Generic timers

The generic timers block consists of 3 identical counters. A timer is set to a pre-configured value
(GenCntStartValue) and counts down once per selected timing pulse (gen_unit_sel). The timer can be enabled or
disabled at any time (gen_tim_en); when disabled the counter is stopped but not cleared. The timer can be
set to automatically restart (gen_tim_auto) after it hits zero. In supervisor mode a timer can be written to or
read from at any time; in user mode access is determined by the GenCntUserModeEnable register settings.

Figure 51. Generic timer RTL diagram

The counter logic is given by

if (gen_wen == 1) then
    gen_tim_cnt = gen_tim_data
elsif ((cnt_en == 1) AND (gen_tim_en == 1)) then
    if (gen_tim_cnt == 0) then // counter may need re-starting
        if (gen_tim_auto == 1) then
            gen_tim_cnt = gen_tim_cnt_st_value
        else
            gen_tim_cnt = gen_tim_cnt
    else
        gen_tim_cnt--
else
    gen_tim_cnt = gen_tim_cnt

The decode logic is

if (gen_tim_cnt == 1) then
    tim_icu_irq = 1
else
    tim_icu_irq = 0
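
The auto-restart behaviour can be checked with a small Python model of one generic timer. The function names are hypothetical and CPU writes are left out; the model covers only the count-down and decode logic shown above.

```python
def step(cnt, start_value, enable, auto, cnt_en=1):
    """Next counter value for one clock of a generic timer."""
    if cnt_en and enable:
        if cnt == 0:                          # counter may need re-starting
            return start_value if auto else 0
        return cnt - 1
    return cnt                                # disabled: hold, do not clear


def irq(cnt):
    """Decode logic: the interrupt is flagged while the count equals 1."""
    return int(cnt == 1)


cnt, counts = 3, []
for _ in range(5):
    cnt = step(cnt, start_value=3, enable=1, auto=1)
    counts.append(cnt)
irqs = [irq(c) for c in counts]
```

With a start value of 3 and auto-restart enabled, the count sequence repeats every 4 timing pulses (3, 2, 1, 0), so in this model the interrupt fires once per 4 pulses.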
15.4.5 Timing pulse generator

The timing pulse generator contains a general free running 64-bit timer and 3 timing pulse generators
producing timing pulses of one cycle duration with a period of 1 µs, 100 µs and 10 ms. In supervisor mode the
free running timer register can be written to or read from at any time; in user mode access is denied and
any access will result in a bus error. The status of each of the 1 µs, 100 µs and 10 ms timers
can be read by accessing the PulseTimerStatus register in supervisor mode.



Figure 52. Pulse generator RTL diagram



15.4.5.1 Free Run Timer

The increment logic block increments the timer count on each clock cycle. The counter wraps around to
zero and continues incrementing if overflow occurs. When the timing register (FreeRunCount) is written
to, the configuration registers block will set free_run_wen high for a clock cycle and the value on
free_run_data will become the new count value, for the 32 bits selected by the free_run_adr signal. If
free_run_adr is 1 the higher 32 bits of the counter will be written to, otherwise the lower 32 bits are
written to. It is the responsibility of software to handle these writes in a sensible manner.

The increment logic is given by

if (free_run_wen == 1) then
    if (free_run_adr == 1) then
        free_run_cnt[63:32] = free_run_data
    else
        free_run_cnt[31:0] = free_run_data
else
    free_run_cnt++



15.4.5.2 Pulse Timers

The pulse timer logic generates timing pulses of 1 clock cycle length and period of 1 µs, 100 µs and 10 ms.

The logic for the 1 µs timer is given by:

// 1us generator
if (pulse_1us_cnt == 0) then
    pulse_1us_cnt = 159
    pulse_1us = 1
else
    pulse_1us_cnt--
    pulse_1us = 0

The logic for the 100 µs timer is given by:

// 100us generator
if ((pulse_100us_cnt == 0) AND (pulse_1us == 1)) then
    pulse_100us_cnt = 99
    pulse_100us = 1
elsif (pulse_1us == 1) then
    pulse_100us_cnt--
    pulse_100us = 0
else
    pulse_100us_cnt = pulse_100us_cnt // hold between 1us pulses
    pulse_100us = 0

The logic for the 10 ms timer is given by:

// 10ms generator
if ((pulse_10ms_cnt == 0) AND (pulse_100us == 1)) then
    pulse_10ms_cnt = 99
    pulse_10ms = 1
elsif (pulse_100us == 1) then
    pulse_10ms_cnt--
    pulse_10ms = 0
else
    pulse_10ms_cnt = pulse_10ms_cnt // hold between 100us pulses
    pulse_10ms = 0
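
The three generators form a cascade: each stage only counts when the previous stage pulses. The Python sketch below models that cascade with configurable divisors so it can be exercised quickly. The class name and constructor parameters are hypothetical; the default of 160 clocks per microsecond is inferred from the reload value of 159 and is an assumption about the pclk frequency. Each stage here reacts to the previous stage's pulse in the same cycle, whereas registered hardware would see it one cycle later.

```python
class PulseGenerator:
    """Cascade of three down-counters producing the 1us, 100us and 10ms pulses."""

    def __init__(self, clocks_per_us=160, us_per_100us=100, n100us_per_10ms=100):
        self.reload = (clocks_per_us - 1, us_per_100us - 1, n100us_per_10ms - 1)
        self.cnt = [0, 0, 0]

    def tick(self):
        """Advance one pclk cycle; return (pulse_1us, pulse_100us, pulse_10ms)."""
        pulses = [0, 0, 0]
        enable = 1                       # the first stage counts every pclk
        for stage in range(3):
            if enable:
                if self.cnt[stage] == 0:
                    self.cnt[stage] = self.reload[stage]
                    pulses[stage] = 1
                else:
                    self.cnt[stage] -= 1
            enable = pulses[stage]       # the next stage counts on this pulse
        return tuple(pulses)


# Small divisors keep the example short: 4 clocks, then 3 pulses, then 2 pulses
gen = PulseGenerator(4, 3, 2)
totals = [sum(col) for col in zip(*(gen.tick() for _ in range(48)))]
```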



15.4.6 Configuration registers

The configuration registers in the TIM are programmed via the CPU interface. Refer to section 11.4.3 on
page 70 for a description of the protocol and timing diagrams for reading and writing registers in the TIM.
Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register reads and
writes, the lower 2 bits of the CPU address bus are not required to decode the address space for the TIM.
When reading a register that is less than 32 bits wide, zeros should be returned on the upper unused bit(s)
of tim_cpu_data. Table 51 lists the configuration registers in the TIM block.



Table 51. Timers Register Map

Address        Register                 #bits   Reset         Description
0x00           WatchDogUnitSel          3       0x0           Specifies the units used for the watchdog timer:
                                                              0 - 1 µs pulse
                                                              1 - 100 µs pulse
                                                              2 - 10 ms pulse
0x04           WatchDogTimer            32      0xFFFF_FFFF   Specifies the number of units to count before
                                                              watchdog timer triggers.
0x08           WatchDogIntThres         32      0x0000_0000   Specifies the threshold value below which the
                                                              watchdog timer issues an interrupt
0x0C-0x10      FreeRunCount[1:0]        2x32    0x0000_0000   Direct access to the free running counter register.
                                                              Bus 0 - Access to bits 31-0
                                                              Bus 1 - Access to bits 63-32
0x14 to 0x1C   GenCntStartValue[2:0]    3x32    0x0000_0000   Generic timer counter start value, number of
                                                              units to count before event
0x20 to 0x28   GenCntValue[2:0]         3x32    0x0000_0000   Direct access to generic timer counter registers
0x2C to 0x34   GenCntUnitSel[2:0]       3x2     0x0           Generic counter unit select. Selects the timing
                                                              units used with corresponding counter:
                                                              0 - 1 µs pulse
                                                              1 - 100 µs pulse
                                                              2 - 10 ms pulse
                                                              3 - pclk
0x38 to 0x40   GenCntAuto[2:0]          3x1     0x0           Generic counter auto re-start select. When
                                                              high timer automatically restarts, otherwise
                                                              timer stops.
0x44 to 0x4C   GenCntEnable[2:0]        3x1     0x0           Generic counter enable.
                                                              0 - Counter disabled
                                                              1 - Counter enabled
0x50           GenCntUserModeEnable     3       0x0           User Mode Access enable to generic timer
                                                              configuration register. When 1 user access is
                                                              enabled.
                                                              Bit 0 - Generic timer 0
                                                              Bit 1 - Generic timer 1
                                                              Bit 2 - Generic timer 2
0x54           DebugSelect              6       0x00          Debug address select. Indicates the address
                                                              of the register to report on the tim_cpu_data
                                                              bus when it is not otherwise being used.

Read Only Registers
0x58           PulseTimerStatus         24      0x00          Current pulse timer values, and pulses
                                                              6:0 - 1 µs timer count
                                                              7 - 1 µs pulse
                                                              14:8 - 100 µs timer count
                                                              15 - 100 µs pulse
                                                              22:16 - 10 ms timer count
                                                              23 - 10 ms pulse
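
The bit layout of the pulse timer status register is regular: each timer occupies one byte, with a 7-bit count and the pulse flag in the top bit. A Python sketch of a decoder follows. The function name and the dictionary keys are hypothetical; only the bit positions come from the table.

```python
def decode_pulse_timer_status(value):
    """Split the 24-bit status value into per-timer count and pulse fields."""
    fields = {}
    for i, name in enumerate(("1us", "100us", "10ms")):
        byte = (value >> (8 * i)) & 0xFF
        fields[name] = {"count": byte & 0x7F, "pulse": byte >> 7}
    return fields


status = decode_pulse_timer_status(0x85129F)
```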
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15.4.6. i Supervisor and user mode access 

The configuration registers block examines the CPU access type (cpu_acode signal) and determines if the
access is allowed to that particular register, based on configured user access registers. If an access is not
allowed the block will issue a bus error by asserting the tim_cpu_berr signal.

The timers block is fully accessible in supervisor data mode: all registers can be written to and read from.
In user mode access is denied to all registers in the block except for the generic timer configuration
registers that are granted user data access. User data access for a generic timer is granted by setting the
corresponding bit in the GenCntUserModeEnable register. This can only be changed in supervisor data mode. If a
particular timer is granted user data access then all registers for configuring that timer will be accessible.
For example, if timer 0 is granted user data access the GenCntStartValue[0], GenCntUnitSel[0], GenCntAuto[0],
GenCntEnable[0] and GenCntValue[0] registers can all be written to and read from without any restriction.

Attempts to access a user data mode disabled timer configuration register will result in a bus error.

Table 52 details the access modes allowed for registers in the TIM block. In supervisor data mode all
registers are accessible. All forbidden accesses will result in a bus error (tim_cpu_berr asserted).
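As an illustration of the access rules described above, the following Python sketch models the permission check. It is not taken from the SoPEC design: the function name tim_access_allowed and the constants are hypothetical, the cpu_acode encodings (01 = user data, 11 = supervisor data) follow the decode used elsewhere in this document, and the address-to-timer mapping assumes the generic timer registers occupy 0x14 to 0x4C in groups of three words, one word per timer.

```python
# Hypothetical model of the TIM access check; names are illustrative only.
USER_DATA = 0b01
SUPERVISOR_DATA = 0b11

GEN_TIMER_FIRST = 0x14   # GenCntStartValue[0]
GEN_TIMER_LAST = 0x4C    # GenCntEnable[2]


def tim_access_allowed(addr, acode, user_mode_enable):
    """Return True if the access is allowed, False if a bus error would result.

    addr             -- word-aligned register offset within the TIM block
    acode            -- 2-bit CPU access code
    user_mode_enable -- 3-bit GenCntUserModeEnable value
    """
    if acode == SUPERVISOR_DATA:
        return True  # supervisor data mode can reach every register
    if acode == USER_DATA and GEN_TIMER_FIRST <= addr <= GEN_TIMER_LAST:
        # Registers come in groups of three (one word per timer), so the
        # timer index repeats every three words.
        timer = ((addr - GEN_TIMER_FIRST) // 4) % 3
        return bool((user_mode_enable >> timer) & 1)
    return False  # any other access is forbidden


# Timer 1 granted to user mode: GenCntValue[1] (0x24) becomes accessible.
print(tim_access_allowed(0x24, USER_DATA, 0b010))
```

The GenCntUserModeEnable register itself (0x50) stays outside the user-accessible range, which matches the statement that it can only be changed in supervisor data mode.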



Table 52. TIM supervisor and user access modes

Address | Register Name | Access Permitted
0x00 | WatchDogUnitSel | Supervisor data mode only
0x04 | WatchDogTimer | Supervisor data mode only
0x08 | WatchDogIntThres | Supervisor data mode only
0x0C-0x10 | FreeRunCount | Supervisor data mode only
0x14 | GenCntStartValue[0] | GenCntUserModeEnable[0]
0x18 | GenCntStartValue[1] | GenCntUserModeEnable[1]
0x1C | GenCntStartValue[2] | GenCntUserModeEnable[2]
0x20 | GenCntValue[0] | GenCntUserModeEnable[0]
0x24 | GenCntValue[1] | GenCntUserModeEnable[1]
0x28 | GenCntValue[2] | GenCntUserModeEnable[2]
0x2C | GenCntUnitSel[0] | GenCntUserModeEnable[0]
0x30 | GenCntUnitSel[1] | GenCntUserModeEnable[1]
0x34 | GenCntUnitSel[2] | GenCntUserModeEnable[2]
0x38 | GenCntAuto[0] | GenCntUserModeEnable[0]
0x3C | GenCntAuto[1] | GenCntUserModeEnable[1]
0x40 | GenCntAuto[2] | GenCntUserModeEnable[2]
0x44 | GenCntEnable[0] | GenCntUserModeEnable[0]
0x48 | GenCntEnable[1] | GenCntUserModeEnable[1]
0x4C | GenCntEnable[2] | GenCntUserModeEnable[2]
0x50 | GenCntUserModeEnable | Supervisor data mode only
0x54 | DebugSelect | Supervisor data mode only
0x58 | PulseTimerStatus | Supervisor data mode only
16 Clocking, Power and Reset (CPR)

The CPR block provides all of the clock, power enable and reset signals to the SoPEC device. 

16.1 POWERDOWN MODES 

The CPR block is capable of powering down certain sections of the SoPEC device. When a section is
powered down (i.e. put in sleep mode) no state is retained, and the CPU must re-initialize the section
before it can be used again. The exact powerdown mechanism is undefined and is technology dependent.

For the purpose of powerdown the SoPEC device is divided into sections: 



Table 53. Powerdown sectioning

Section | Blocks
Print Engine Pipeline Subsystem | CDU, CFU, SFU, TE, TFU, HCU, DNC, DWU, LLU, PHI
CPU-DRAM (Section 1) | DRAM, CPU/MMU, DIU, TIM, ROM, LSS interface
Comms Subsystem (Section 2) | USB, ISI, DMA Ctrl, GPIO, PSS, ICU



16.1.1 Sleep mode 

Each section can be put into sleep mode by setting the corresponding bit in the SleepModeEnable register.
To re-enable the section the sleep mode bit needs to be cleared and then the section should be reset by
writing to the relevant bit in the ResetSection register. Each block within the section should then be
re-configured by the CPU.

If the CPU system is put into sleep mode, the SoPEC device will remain in sleep mode until a system level
reset is initiated from the reset pin, or a wakeup reset is issued by the SCB block as a result of activity
on either the USB or ISI bus. If all sections are put into sleep mode, then only a system level reset
initiated by the reset pin will re-activate the SoPEC device.

Like all software resets in SoPEC the ResetSection register is active-low, i.e. a 0 should be written to each
bit position requiring a reset. The ResetSection register is self-resetting.
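The sleep and wake sequence can be sketched with a small software model. This is only an illustration under stated assumptions: the class CprModel and the helper names are hypothetical, both registers are assumed to be 3 bits wide with one bit per section, and the self-resetting behaviour is modelled by restoring ResetSection to all ones after each write.

```python
# Hypothetical software model of SleepModeEnable / ResetSection behaviour.
class CprModel:
    def __init__(self):
        self.sleep_mode_enable = 0x0  # all sections awake
        self.reset_section = 0x7      # active-low: all ones means no reset
        self.reset_log = []           # sections that have been reset

    def write_reset_section(self, value):
        # A 0 in a bit position requests a reset of that section.
        for section in range(3):
            if not (value >> section) & 1:
                self.reset_log.append(section)
        self.reset_section = 0x7      # register is self-resetting


def sleep_section(cpr, section):
    """Put one section into sleep mode."""
    cpr.sleep_mode_enable |= 1 << section


def wake_section(cpr, section):
    """Clear the sleep bit, then reset the section (active-low write)."""
    cpr.sleep_mode_enable &= ~(1 << section) & 0x7
    cpr.write_reset_section(0x7 & ~(1 << section))
    # The CPU would now re-configure every block in the section.


cpr = CprModel()
sleep_section(cpr, 0)
wake_section(cpr, 0)
print(cpr.sleep_mode_enable, cpr.reset_section, cpr.reset_log)
```

The model only covers software-controlled sections; waking the CPU section itself requires the external or SCB wakeup reset described above.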

16.2 Reset source

The SoPEC device can be reset by a number of sources. When a reset from an internal source is initiated
the reset source register (ResetSrc) stores the reset source value. This register can then be used by the CPU
to determine the type of boot sequence required.

16.3 Clock relationship

The crystal oscillator excites a 32 MHz crystal through the xtalin and xtalout pins. The 32 MHz output is
used by the PLL to derive the master VCO frequency of 960 MHz. The master clock is then divided to
produce 320 MHz (clk320), 160 MHz (clk160), 106 MHz (clk106) and 48 MHz (clk48) clock sources.

The phase relationship of each clock from the PLL will be defined. The relationship of internal clocks
clk320, clk106, clk48 and clk160 to xtalin will be undefined. The clock tree generation should create
insertion delays so as to compensate for the phase difference of the clocks leaving the PLL. At the output
of the clock block, the skew between each pclk domain (pclk_section[2:0] and jclk) should be within skew
tolerances of their respective domains (defined as less than the hold time of a D-type flip flop).

The skew between doclk and phiclk should also be less than the skew tolerances of their respective
domains.

The usbclk is derived from the PLL output and has no relationship with the other clocks in the system and
is considered asynchronous.

There is no skew requirement between the pclk domains and the doclk and phiclk domains; they are
considered essentially asynchronous to each other.
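The frequency relationships above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the integer divide ratios (3, 6, 9 and 20 relative to the 960 MHz master clock) are inferred from the stated frequencies and are an assumption, not values quoted from the design, and the names used are hypothetical.

```python
from fractions import Fraction

XTAL_MHZ = 32   # crystal frequency stated in the text
VCO_MHZ = 960   # master VCO frequency stated in the text

# Assumed divide ratios relative to the master clock (inferred, not quoted).
DIVIDE_RATIOS = {"clk320": 3, "clk160": 6, "clk106": 9, "clk48": 20}


def derived_clocks_mhz(vco_mhz=VCO_MHZ):
    """Return the exact frequency of each derived clock in MHz."""
    return {name: Fraction(vco_mhz, ratio) for name, ratio in DIVIDE_RATIOS.items()}


def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return Fraction(1000) / freq_mhz


clocks = derived_clocks_mhz()
for name, freq in clocks.items():
    print(f"{name}: {float(freq):.2f} MHz, period {float(period_ns(freq)):.3f} ns")
```

Under these assumptions the nominal 106 MHz clock is exactly 320/3 MHz (about 106.67 MHz), and the 960 MHz master clock has a period of roughly 1.04 ns.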



[Figure 53: timing diagram showing the PLL master clock (1.04 ns period) together with clk320, doclk, clk160, jclk, clk106 and phiclk, marking the PLL phase shift of clk320, clk160 and clk106 and the insertion delays of doclk, pclk/jclk and phiclk.]

Figure 53. SoPEC clock relationship

16.4 Implementation

16.4.1 Definitions of I/O

Table 54. CPR I/O definition

Port name | Pins | I/O | Description

Clocks and Resets
xtalin | 1 | In | Crystal input, direct from IO pin.
xtalout | 1 | Out | Crystal output, direct to IO pin.
pclk_section[2:0] | 3 | Out | System clocks for each section
phiclk | 1 | Out | Printhead interface clock (doclk/3) for the PHI block
doclk | 1 | Out | Data out clock (2x pclk) for the PHI block
jclk | 1 | Out | Gated version of system clock used to clock the JPEG decoder core in the CDU
usbclk | 1 | Out | USB clock at 3 times the crystal input frequency, nominally at 48 MHz
jclk_enable | 1 | In | Gating signal for jclk.
reset_n | 1 | In | Reset signal from the reset_n pin
usb_cpr_reset_n | 1 | In | Reset signal from the USB block
isi_cpr_reset_n | 1 | In | Reset signal from the ISI block
tim_cpr_reset_n | 1 | In | Reset signal from watch dog timer.
prst_n_section[2:0] | 3 | Out | System resets for each section, synchronous active low
phirst_n | 1 | Out | Reset for PHI block, synchronous to phiclk
dorst_n | 1 | Out | Reset for PHI block, synchronous to doclk
 | 1 | Out | Reset for JPEG decoder core in CDU block, synchronous to jclk
usbrst_n | 1 | Out | Reset for the USB block, synchronous to usbclk

Test Input
test_clk | 1 | In | Test clock direct from external pin, for use in production test (scan test)
test_enable | 1 | In | Test enable. Direct from external pin. When high production test mode is enabled.

CPU Interface
cpu_adr[3:2] | 2 | In | CPU address bus. Only 2 bits are required to decode the address space for the CPR block
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
cpr_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU.
cpu_cpr_sel | 1 | In | Block select from the CPU. When cpu_cpr_sel is high both cpu_adr and cpu_dataout are valid
cpr_cpu_rdy | 1 | Out | Ready signal to the CPU. When cpr_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the block and for a read cycle this means the data on cpr_cpu_data is valid.
cpr_cpu_berr | 1 | Out | Bus error signal to the CPU indicating an invalid access.
cpu_acode[1:0] | 2 | In | CPU Access Code signals. These decode as follows: 00 - User program access; 01 - User data access; 10 - Supervisor program access; 11 - Supervisor data access
cpr_cpu_debug_valid | 1 | Out | Debug Data valid on cpr_cpu_data bus. Active high

Miscellaneous
pwr_sleep_mode[2:0] | 3 | Out | Sleep mode section select



16.4.2 Configuration registers 

The configuration registers in the CPR are programmed via the CPU interface. Refer to section 11.4 on
page 69 for a description of the protocol and timing diagrams for reading and writing registers in the CPR.
Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register reads and
writes, the lower 2 bits of the CPU address bus are not required to decode the address space for the CPR.
When reading a register that is less than 32 bits wide zeros should be returned on the upper unused bit(s)
of cpr_cpu_data. Table 55 lists the configuration registers in the CPR block.

The CPR block will only allow supervisor data mode accesses (i.e. cpu_acode[1:0] = SUPERVISOR_DATA).
All other accesses will result in cpr_cpu_berr being asserted.



Table 55. CPR Register Map

Address | Register | #bits | Reset | Description
0x00 | SleepModeEnable | 3 | 0x0 | Sleep Mode enable, when high a section of logic is powered down. Each bit controls a section.
0x04 | ResetSrc | 4 | 0x0 (a) | Reset Source register, indicating the source of the last reset: Bit 0 - External Reset; Bit 1 - USB wakeup reset; Bit 2 - ISI wakeup reset; Bit 3 - Watchdog timer reset
0x08 | ResetSection | 3 | 0x7 | Active-low synchronous reset for each section, self-resetting.
0x0C | DebugSelect | 6 | 0x00 | Debug address select, indicates the address of the register to report on the cpr_cpu_data bus when it is not otherwise being used.

PLL Control (Asynchronous reset registers)
0x10 | PLLTuneBits | 10 | 0x23E | PLL tuning bits
0x14 | PLLRangeA | 4 | 0xF | PLLOUT A frequency selector (defaults to 600 MHz to 1250 MHz)
0x18 | PLLRangeB | 3 | 0x7 | PLLOUT B frequency selector (defaults to 600 MHz to 1250 MHz)
0x1C | PLLMultiplier | 5 | 0x25 | PLL multiplier selector, defaults to refclk x 20

a. Reset value depends on reset source. External reset shown.

16.4.3 CPR Sub-block partition



[Figure 54: block diagram of the CPR. The Clock Generator (crystal oscillator and PLL, fed by xtalin, xtalout and test_enable) produces clk106, clk320, clk48 and clk160. The Gate Enable Logic (driven by jclk_enable, producing gate_dom and pwr_sleep_mode), the Reset Logic (driven by usb_cpr_reset_n and isi_cpr_reset_n, producing reset_dom[6:0]) and the Configuration registers (connected to the CPU) control the outputs doclk, phiclk and pclk_section[2:0].]

Figure 54. CPR block partition



16.4.4 Sync reset 



The reset synchronizer retimes an asynchronous reset signal to the clock domain that it resets. The circuit
prevents the inactive edge of reset occurring when the clock is rising.

[Figure 55: reset synchronizer circuit with inputs pclk and reset_dom and output prst_n.]

Figure 55. Reset synchronizer logic

16.4.5 Reset generator logic

The reset generator logic is used to determine which clock domains should be reset, based on configured
reset values (reset_section_n), the external reset (reset_n), watchdog timer reset (tim_cpr_reset_n) and
resets from the SCB block (isi_cpr_reset_n, usb_cpr_reset_n). The reset direct from the IO pin (reset_n) is
synchronized and de-glitched before feeding the reset logic.

Resets from the SCB block reset everything except its own section (section 2). This allows data to be stored
in the PSS block for use after an SCB powerup initiated reset.



Table 56. Reset domains

Reset domain | Clock domain
reset_dom[0] | doclk domain
reset_dom[1] | phiclk domain
reset_dom[2] | usbclk domain
reset_dom[3] | Section 0 pclk domain
reset_dom[4] | Section 1 pclk domain
reset_dom[5] | Section 2 pclk domain
reset_dom[6] | jclk domain



The logic is given by

    if (reset_n == 0) then
        reset_dom[6:0] = 0x00       // reset everything
        reset_src[3:0] = 0x01
    elsif (usb_cpr_reset_n == 0) then
        reset_dom[6:0] = 0x20       // all except comms domain
        reset_src[3:0] = 0x02
    elsif (isi_cpr_reset_n == 0) then
        reset_dom[6:0] = 0x20       // all except comms domain
        reset_src[3:0] = 0x04
    elsif (tim_cpr_reset_n == 0) then
        reset_dom[6:0] = 0x00       // reset everything
        reset_src[3:0] = 0x08
    else
        // propagate resets from reset section register
        reset_dom[5:0] = 0x3F
        if (reset_section_n[0] == 0) then
            reset_dom[3] = 0
        if (reset_section_n[1] == 0) then
            reset_dom[4] = 0
        if (reset_section_n[2] == 0) then
            reset_dom[5] = 0
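For readers who want to experiment with the priority scheme, the reset generator pseudocode can be transcribed into a small Python function. This is a sketch, not the implementation: the function name is hypothetical, reset_section_n is passed as a 3-bit integer, and because the pseudocode only assigns reset_dom[5:0] in the final branch, the sketch assumes reset_dom[6] (the jclk domain) is left de-asserted (1) in that case.

```python
def reset_generator(reset_n, usb_cpr_reset_n, isi_cpr_reset_n,
                    tim_cpr_reset_n, reset_section_n):
    """Return (reset_dom, reset_src) for one evaluation of the reset logic.

    All reset inputs are active low. reset_dom is a 7-bit active-low vector.
    reset_src is None when no reset source fired (the register is unchanged).
    """
    if reset_n == 0:
        return 0x00, 0x01          # external reset: reset everything
    if usb_cpr_reset_n == 0:
        return 0x20, 0x02          # USB wakeup: all except comms (section 2)
    if isi_cpr_reset_n == 0:
        return 0x20, 0x04          # ISI wakeup: all except comms (section 2)
    if tim_cpr_reset_n == 0:
        return 0x00, 0x08          # watchdog: reset everything

    # Software resets from the ResetSection register.
    reset_dom = 0x7F               # assumption: bit 6 (jclk) stays de-asserted
    for section in range(3):
        if not (reset_section_n >> section) & 1:
            reset_dom &= ~(1 << (3 + section))   # sections map to bits 3..5
    return reset_dom, None


# Software reset of section 1 only.
print(reset_generator(1, 1, 1, 1, 0b101))
```

The order of the tests matters: the external reset takes priority over the wakeup resets, which in turn take priority over the watchdog reset.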



16.4.6 Gate enable logic

The gate enable logic is a combinational logic block used to generate gating signals for each of SoPEC's
clock domains. The gate enable (gate_dom) is generated based on the configured sleep_mode_en and
the internally generated jclk_enable signal.

The logic is given by

    // clock gating for sleep modes
    gate_dom[5:3] = 0x7             // default to on
    for (i = 0; i < 3; i++) {
        if (sleep_mode_en[i] == 1) then
            gate_dom[i+3] = 0
            pwr_sleep_mode[i] = 1
    }
    // jclk and remaining
    gate_dom[2:0] = 0x7
    gate_dom[6] = jclk_enable

16.4.7 Clock gate logic

The clock gate logic is used to safely gate clocks without generating any glitches on the gated clock. When
the enable is high the clock is active, otherwise the clock is gated.

[Figure 56: clock gate circuit in which gate_dom is retimed to gate_dom_retimed before it gates the clock.]

Figure 56. Clock gate logic diagram

16.4.8 Clock generator logic

The clock generator block contains the PLL, crystal oscillator, clock dividers and associated control and
test logic. The PLL VCO frequency is at 960 MHz, locked to a 32 MHz refclk generated by the crystal
oscillator. In test mode the xtalin signal can be driven directly by the test clock generator; the test clock
will be reflected on the refclk signal to the PLL.



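[Figure 57: the crystal oscillator (xtalin, xtalout) drives refclk into the PLL, which is configured by pll_range_a, pll_range_b, pll_multiplier and pll_tune; the clock dividers, reset by prst_n, output clk320, clk160 and clk106.]

Figure 57. PLL and Clock divider logic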



16.4.8.1 Clock divider A

The clock divider A block generates the 320 MHz, 160 MHz and 106 MHz clocks from the input 320 MHz
clock (pll_outb) generated by the PLL. The divider flip-flops are asynchronously reset by the prst_n signal.
The dividers are enabled only when the PLL has acquired lock as indicated by the pll_lock signal.

16.4.8.2 Clock divider B

The clock divider B block generates the 48 MHz clock from the input 96 MHz clock (pll_outa) generated by
the PLL. The divider flip-flops are asynchronously reset by the prst_n signal. The dividers are enabled only
when the PLL has acquired lock as indicated by the pll_lock signal.

17 ROM Block



17.1 Overview 

The ROM block interfaces to the CPU bus and contains the SoPEC boot code. The ROM block consists of
the CPU bus interface, the ROM macro and the ChipID macro. The current ROM size is 16 KBytes,
implemented as a 4096 x 32 macro. Access to the ROM is not cached because the CPU enjoys fast (no more than
one cycle slower than a cache access), unarbitrated access to the ROM.

Each SoPEC device is required to have a unique ChipID which is set by blowing fuses at manufacture.
IBM's 300mm ECID macro is to be used to implement the ChipID and this offers 112 bits of laser fuses.
The exact number of fuse bits to be used for the ChipID will be determined later but all bits are made
available to the CPU. The ECID macro allows all 112 bits to be read out in parallel and the ROM block
will make all 112 bits available in the FuseChipID[N] registers, which are readable by the CPU in
supervisor mode only.

17.2 Boot operation

There are two boot scenarios for the SoPEC device, namely after power-on and after being awoken from
sleep mode. When the device is in sleep mode it is hoped that power will actually be removed from the
DRAM, CPU and most other peripherals and so the program code will need to be freshly downloaded each
time the device wakes up from sleep mode. In order to reduce the wakeup boot time (and hence the
perceived print latency) certain data items are stored in the PSS block (see section 18). These data items
include the SHA-1 hash digest expected for the program(s) to be downloaded, the master/slave SoPEC id
and some configuration parameters (currently TBD). All of these data items are stored in the PSS by the
CPU prior to entering sleep mode. The SHA-1 value stored in the PSS is calculated by the CPU by
decrypting the signature of the downloaded program using the appropriate public key stored in ROM. This
compute intensive decryption only needs to take place once as part of the power-on boot sequence;
subsequent wakeup boot sequences will simply use the resulting SHA-1 digest stored in the PSS. Note that the
digest only needs to be stored in the PSS before entering sleep mode and the PSS can be used for
temporary storage of any data at all other times.

The CPU is expected to be in supervisor mode for the entire boot sequence described by the pseudocode
below. Note that the boot sequence has not been finalised but is expected to be close to the following:

    if (ResetSrc == 1) then                 // Reset was a power-on reset
        configure_sopec                     // need to configure peris (USB, ISI, DMA, ICU etc.)
    // Otherwise reset was a wakeup reset so peris etc. were already configured
    PAUSE: wait until IrqSemaphore != 0     // i.e. wait until an interrupt has been serviced
    if (IrqSemaphore == DMAChan0Msg) then
        parse_msg(DMAChan0MsgPtr)           // this routine will parse the message and take any
                                            // necessary action e.g. programming the DMAChannel1 registers
    elsif (IrqSemaphore == DMAChan1Msg) then   // program has been downloaded
        CalculatedHash = gen_sha1(ProgramLocn, ProgramSize)
        if (ResetSrc == 1) then
            ExpectedHash = sig_decrypt(ProgramSig)
        else
            ExpectedHash = PSSHash
        if (ExpectedHash == CalculatedHash) then
            jmp(ProgramLocn)                // transfer control to the downloaded program
        else
            send_host_msg("Program Authentication Failed")
            goto PAUSE
    elsif (IrqSemaphore == timeout) then    // nothing has happened
        if (ResetSrc == 1) then
            sleep_mode()                    // put SoPEC into sleep mode to be woken up by USB/ISI activity
        else                                // we were woken up but nothing happened
            reset_sopec(PowerOnReset)
    else
        goto PAUSE
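The authentication step of the boot sequence, which selects the expected digest according to the reset source and compares it with the SHA-1 of the downloaded program, can be sketched in Python. This is an illustration only: the function names are hypothetical, sig_decrypt is passed in as a caller-supplied callable because the signature scheme is not described here, and the constant-time comparison is a choice made for this sketch rather than something the source specifies.

```python
import hashlib
import hmac

POWER_ON_RESET = 1  # ResetSrc value the pseudocode treats as a power-on reset


def expected_hash(reset_src, program_sig, pss_hash, sig_decrypt):
    """Pick the digest to compare against, as in the boot pseudocode."""
    if reset_src == POWER_ON_RESET:
        return sig_decrypt(program_sig)   # expensive public-key operation
    return pss_hash                       # wakeup: reuse digest saved in the PSS


def program_is_authentic(program, reset_src, program_sig, pss_hash, sig_decrypt):
    """Return True if the SHA-1 of the program matches the expected digest."""
    calculated = hashlib.sha1(program).digest()
    expected = expected_hash(reset_src, program_sig, pss_hash, sig_decrypt)
    return hmac.compare_digest(calculated, expected)


# Wakeup boot: the digest stored in the PSS before sleeping is reused.
program = b"downloaded program image"
saved_digest = hashlib.sha1(program).digest()
print(program_is_authentic(program, 2, b"", saved_digest, lambda sig: b""))
```

On a wakeup boot the sig_decrypt callable is never invoked, which reflects the stated goal of avoiding the compute intensive decryption after sleep.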

The boot code places no restrictions on the activity of any programs downloaded and authenticated by it
other than those imposed by the configuration of the MMU, i.e. the principal function of the boot code is to
authenticate that any programs downloaded by it are from a trusted source. It is the responsibility of the
downloaded program to ensure that any code it downloads is also authenticated and that the system
remains secure. The downloaded program code is also responsible for setting the SoPEC ISIId (see section
12.7 for a description of the ISIId) in a multi-SoPEC system. See the "SoPEC Security Overview"
document [9] for more details of the SoPEC security features.

17.3 Implementation 

17.3.1 Definitions of I/O

Table 57. ROM Block I/O

Port name | Pins | I/O | Description

Clocks and Resets
prst_n | 1 | In | Global reset. Synchronous to pclk, active low.
pclk | 1 | In | Global clock

CPU Interface
cpu_adr[15:2] | 14 | In | CPU address bus. Only 14 bits are required to decode the address space for this block.
rom_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_acode[1:0] | 2 | In | CPU Access Code signals. These decode as follows: 00 - User program access; 01 - User data access; 10 - Supervisor program access; 11 - Supervisor data access
cpu_rom_sel | 1 | In | Block select from the CPU. When cpu_rom_sel is high cpu_adr is valid
rom_cpu_rdy | 1 | Out | Ready signal to the CPU. When rom_cpu_rdy is high it indicates the last cycle of the access. For a read cycle this means the data on rom_cpu_data is valid.
rom_cpu_berr | 1 | Out | ROM bus error signal to the CPU indicating an invalid access.



17.3.2 Configuration registers 

The ROM block will only allow read accesses to the FuseChipID registers with supervisor data space
permissions (i.e. cpu_acode[1:0] = 11). All other accesses of the FuseChipID registers will result in
rom_cpu_berr being asserted. The ROM block allows all read accesses to the ROM itself (i.e. supervisor or
user, data or program accesses). The CPU subsystem bus slave interface is described in more detail in
section 9.4.3.

Table 58. ROM Block Register Map

17.3.3 Sub-Block Partition

IBM offer two variants of their ROM macros: a high performance version (ROMHD) and a low power
version (ROMLD). It is likely that the low power version will be used unless some implementation issue
requires the high performance version. Both versions offer the same bit density. The sub-block partition
diagram below does not include the clocking and test signals for the ROM or ECID macros. The CPU
subsystem bus interface is described in more detail in section 11.4.3.



[Figure 58: the CPU bus interface (cpu_adr, rom_cpu_data, cpu_rom_sel, cpu_rwn, rom_cpu_rdy, cpu_acode, rom_cpu_berr) connects to the 4096x32 ROM macro through rom_adr and rom_data, and to the fuses of the IBM 300mm ECID macro through fuse_data and fuse_reg_adr.]

Figure 58. Sub-block partition of the ROM block

17.3.4 Sub-block signal definition 

Table 59. ROM Block Internal signals

Port name | Width | Description

Clocks and Resets
prst_n | 1 | Global reset. Synchronous to pclk, active low.
pclk | 1 | Global clock

Internal Signals
rom_adr[11:0] | 12 | ROM address bus
rom_sel | 1 | Select signal to the ROM macro instructing it to access the location at rom_adr
rom_oe | 1 | Output enable signal to the ROM block
rom_data[31:0] | 32 | Data bus from the ROM macro to the CPU bus interface
rom_dvalid | 1 | Signal from the ROM macro indicating that the data on rom_data is valid for the address on rom_adr
fuse_data[31:0] | 32 | Data from the FuseChipID[N] register addressed by fuse_reg_adr
fuse_reg_adr[1:0] | 2 | Indicates which of the FuseChipID registers is being addressed

18 Power Safe Storage (PSS) Block




18.1 Overview

The PSS block provides 128 bytes of storage space that will maintain its state when the rest of the SoPEC
device is in sleep mode. The PSS is expected to be used primarily for the storage of decrypted signatures
associated with downloaded programmed code but it can also be used to store any information that needs
to survive sleep mode (e.g. configuration details). Note that the signature digest only needs to be stored in
the PSS before entering sleep mode and the PSS can be used for temporary storage of any data at all other
times.

Prior to entering sleep mode the CPU should store all of the information it will need on exiting sleep mode
in the PSS. On emerging from sleep mode the boot code in ROM will read the ResetSrc register in the CPR
block to determine which reset source caused the wakeup. The reset source information indicates whether
or not the PSS contains valid stored data, and the PSS data determines the type of boot sequence to
execute. If for any reason a full power-on boot sequence should be performed (e.g. the printer driver has been
updated) then this is simply achieved by initiating a full software reset.

18.2 Implementation

The storage area of the PSS block will be implemented as a 128-byte register array. The array is located
from PSS_base through to PSS_base+0x7F in the address map. The PSS block will only allow read or
write accesses with supervisor data space permissions (i.e. cpu_acode[1:0] = 11). All other accesses will
result in pss_cpu_berr being asserted. The CPU subsystem bus slave interface is described in more detail
in section 11.4.3.
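A minimal software model of the PSS register array may help when reasoning about its address decode. The sketch below is an assumption-laden illustration rather than the hardware design: the class and exception names are hypothetical, offsets are taken relative to PSS_base, accesses are assumed to be 32 bits wide (so the 128 bytes form 32 words, matching a 5-bit word address), and misaligned or out-of-range offsets are treated as bus errors purely for the purposes of the model.

```python
SUPERVISOR_DATA = 0b11  # cpu_acode value required by the PSS block


class BusError(Exception):
    """Stands in for pss_cpu_berr being asserted."""


class PssModel:
    SIZE_BYTES = 128

    def __init__(self):
        self.words = [0] * (self.SIZE_BYTES // 4)   # 32 words of 32 bits

    def _word_index(self, offset, acode):
        if acode != SUPERVISOR_DATA:
            raise BusError("supervisor data access required")
        if offset % 4 or not 0 <= offset < self.SIZE_BYTES:
            raise BusError("offset outside the 128-byte array")  # model assumption
        return offset >> 2            # byte offset bits [6:2] select the word

    def write(self, offset, value, acode):
        self.words[self._word_index(offset, acode)] = value & 0xFFFFFFFF

    def read(self, offset, acode):
        return self.words[self._word_index(offset, acode)]


pss = PssModel()
pss.write(0x7C, 0x12345678, SUPERVISOR_DATA)   # last word of the array
print(hex(pss.read(0x7C, SUPERVISOR_DATA)))
```

A 160-bit SHA-1 digest occupies five of the 32 words, leaving the remainder for other data that must survive sleep mode.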
18.2.1 Definitions of I/O



Table 60. PSS Block I/O

Port name | Pins | I/O | Description

Clocks and Resets
prst_n | 1 | In | Global reset. Synchronous to pclk, active low.
pclk | 1 | In | Global clock

CPU Interface
cpu_adr[6:2] | 5 | In | CPU address bus. Only 5 bits are required to decode the address space for this block.
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
pss_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_acode[1:0] | 2 | In | CPU Access Code signals. These decode as follows: 00 - User program access; 01 - User data access; 10 - Supervisor program access; 11 - Supervisor data access
cpu_pss_sel | 1 | In | Block select from the CPU. When cpu_pss_sel is high both cpu_adr and cpu_dataout are valid
pss_cpu_rdy | 1 | Out | Ready signal to the CPU. When pss_cpu_rdy is high it indicates the last cycle of the access. For a read cycle this means the data on pss_cpu_data is valid.
pss_cpu_berr | 1 | Out | PSS bus error signal to the CPU indicating an invalid access.

19 Low Speed Serial Interface (LSS)



19.1 Overview 



The Low Speed Serial Interface (LSS) provides a mechanism for the internal SoPEC CPU to communicate
with external QA chips via two independent LSS buses. The LSS communicates through the GPIO block
to the QA chips. This allows the QA chip pins to be reused in multi-SoPEC environments. The LSS
Master system-level interface is illustrated in Figure 59. Note that multiple QA chips are allowed on each
LSS bus.

[Figure 59: within SoPEC, the CPU connects over the CPU sub-system bus to the LSS Master, which drives LSS bus 0 and LSS bus 1 through the GPIO block to the external QA Chips 0 to 3.]

Figure 59. LSS master system-level interface



19.2 QA COMMUNICATION 

The SoPEC data interface to the QA Chips is a low speed, 2 pin, synchronous serial bus. Data is
transferred to the QA chips via the lss_data pin synchronously with the lss_clk pin. When lss_clk is high the
data on lss_data is deemed to be valid. Only the LSS master in SoPEC can drive the lss_clk pin; this pin is
an input only to the QA chips. The LSS block must be able to interface with an open-collector pull-up bus.
This means that when the LSS block should transmit a logical zero it will drive 0 on the bus, but when it
should transmit a logical 1 it will leave high-impedance on the bus (i.e. it doesn't drive the bus). If all the
agents on the LSS bus adhere to this protocol then there will be no issues with bus contention.
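The open-collector behaviour amounts to a wired-AND: the line reads low if any agent pulls it low and high only when every agent has released it. The following Python sketch illustrates this; the function names are hypothetical and the model ignores all electrical detail such as rise times.

```python
def pulls_low(bit):
    """An agent transmitting `bit` pulls the line low for 0 and releases it for 1."""
    return bit == 0


def bus_level(agents_pulling_low):
    """Resolve the line level from each agent's pull-down state.

    agents_pulling_low -- iterable of booleans, True if that agent drives 0.
    The pull-up resistor supplies the 1 when nobody is driving.
    """
    return 0 if any(agents_pulling_low) else 1


# Master sends 1 (releases the line) while one QA chip acknowledges with 0.
print(bus_level([pulls_low(1), pulls_low(0), pulls_low(1)]))
```

Because no agent ever actively drives a 1, two agents transmitting different values cannot fight each other electrically; the 0 simply wins.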

The LSS block controls all communication to and from the QA chips. The LSS block is the bus master in
all cases. The LSS block interprets a command register set by the SoPEC CPU, initiates transactions to the
QA chip in question and optionally accepts return data. Any return information is presented through the
configuration registers to the SoPEC CPU. The LSS block indicates to the CPU the completion of a
command or the occurrence of an error via an interrupt.

19.2.1 Start and stop conditions 

All transmissions on the LSS bus are initiated by the LSS master issuing a START condition and
terminated by the LSS master issuing a STOP condition. START and STOP conditions are always generated by
the LSS master. As illustrated in Figure 60, a START condition corresponds to a high to low transition on
lss_data while lss_clk is high. A STOP condition corresponds to a low to high transition on lss_data while
lss_clk is high.



[Figure 60: lss_data and lss_clk waveforms showing a START condition followed by a STOP condition.]

Figure 60. START and STOP conditions
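The START and STOP definitions can be expressed as a small classifier over consecutive samples of the two lines. This is an illustrative sketch only: the function name and the (lss_clk, lss_data) tuple representation are assumptions, and the samples are assumed to be taken often enough that no transition is missed.

```python
def classify_transition(prev, curr):
    """Classify the change between two consecutive (lss_clk, lss_data) samples.

    Returns "START", "STOP" or None.
    """
    prev_clk, prev_data = prev
    clk, data = curr
    if prev_clk == 1 and clk == 1:          # clock held high across both samples
        if prev_data == 1 and data == 0:
            return "START"                  # high-to-low on lss_data
        if prev_data == 0 and data == 1:
            return "STOP"                   # low-to-high on lss_data
    return None                             # ordinary data change or no change


samples = [(1, 1), (1, 0), (0, 0), (0, 1), (1, 1), (0, 1), (0, 0), (1, 0), (1, 1)]
events = [classify_transition(a, b) for a, b in zip(samples, samples[1:])]
print([e for e in events if e])
```

Data changes made while lss_clk is low are never classified as conditions, which is why data is only allowed to change during the low period of the clock.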



19.2.2 Data transfer

Data is transferred on the LSS bus via a byte orientated protocol. Bytes are transmitted serially. Each byte
is sent most significant bit (MSB) first through to least significant bit (LSB) last. One clock pulse is
generated for each data bit transferred. Each byte must be followed by an acknowledge bit.

The data on lss_data must be stable during the HIGH period of the lss_clk clock. Data may only
change when lss_clk is low. A transmitter outputs data after the falling edge of lss_clk and a receiver inputs
the data at the rising edge of lss_clk. This data is only considered as a valid data bit at the next lss_clk
falling edge provided a START or STOP is not detected in the period before the next lss_clk falling edge. All
clock pulses are generated by the LSS block. The transmitter releases the lss_data line (high) during the
acknowledge clock pulse (ninth clock pulse). The receiver must pull down the lss_data line during the
acknowledge clock pulse so that it remains stable low during the HIGH period of this clock pulse.

Data transfers follow the format shown in Figure 61. The first byte sent by the LSS master after a START
condition is a primary id byte, where bits 7-2 form a 6-bit primary id (0 is a global id and will address all
QA Chips on a particular LSS bus), bit 1 is an even parity bit for the primary id, and bit 0 forms the
read/write sense. Bit 0 is high if the following command is a read to the primary id given or low for a write
command to that id. An acknowledge is generated by the QA chip(s) corresponding to the given id (if such
a chip exists) by driving the lss_data line low synchronous with the LSS master generated ninth lss_clk.



[Figure 61: lss_data and lss_clk waveforms for a transfer of two data bytes, in the order START condition, ID byte[7:1], R/W, ACK, DATA, ACK, DATA, ACK, STOP.]

Figure 61. LSS transfer of 2 data bytes
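The layout of the primary id byte can be made concrete with a short Python sketch. The helper names are hypothetical, and the parity convention is an assumption: "even parity" is taken to mean that the parity bit is chosen so that the six id bits plus the parity bit contain an even number of ones.

```python
def primary_id_byte(primary_id, read):
    """Build the first byte of an LSS transfer.

    Bits 7-2: 6-bit primary id (0 is the global id)
    Bit 1   : even parity over the primary id (assumed convention)
    Bit 0   : 1 for a read, 0 for a write
    """
    if not 0 <= primary_id < 64:
        raise ValueError("primary id must fit in 6 bits")
    parity = bin(primary_id).count("1") & 1   # 1 if the id has an odd number of ones
    return (primary_id << 2) | (parity << 1) | (1 if read else 0)


def bits_msb_first(byte):
    """Return the eight bits of a byte in transmission order (MSB first)."""
    return [(byte >> position) & 1 for position in range(7, -1, -1)]


write_byte = primary_id_byte(0b000001, read=False)
print(hex(write_byte), bits_msb_first(write_byte))
```

After these eight bits the master releases lss_data for the ninth clock pulse so that the addressed QA chip can pull the line low as its acknowledge.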



19.2.3 Write procedure

The protocol for a write access to a QA Chip over the LSS bus is illustrated in Figure 62 below. The LSS
master in SoPEC initiates the transaction by generating a START condition on the LSS bus. It then
transmits the primary id byte with a 0 in bit 0 to indicate that the following command is a write to the primary
id. An acknowledge is generated by the QA chip corresponding to the given primary id. The LSS master
will clock out M data bytes with the slave QA Chip acknowledging each successful byte written. Once the
slave QA chip has acknowledged the Mth data byte the LSS master issues a STOP condition to complete
the transfer. The QA chip gathers the M data bytes together and interprets them as a command. See the QA
Chip Interface Specification [8] for more details on the format of the commands used to communicate with
the QA chip. Note that the QA chip is free to not acknowledge any byte transmitted. The LSS master
should respond by issuing an interrupt to the CPU to indicate this error. The CPU should then generate a
STOP condition on the LSS bus to gracefully complete the transaction on the LSS bus.



[Diagram: S, IDbyte[7:1], then Data(8) fields for Byte 0 through Byte M-1 and Byte M]

S = Start condition
A = Ack
N = Nack
P = Stop condition
Shaded bits driven by slave

Figure 62. Example of LSS write to a QA Chip



19.2.4 Read procedure

The LSS master in SoPEC initiates the transaction by generating a START condition on the LSS bus. It
then transmits the primary id byte with a 1 in bit 0 to indicate that the following command is a read to the
primary id. An acknowledge is generated by the QA chip corresponding to the given primary id. The LSS
master releases the lss_data bus and proceeds to clock the expected number of bytes from the QA chip,
with the LSS master acknowledging each successful byte read. The last expected byte is not acknowledged
by the LSS master. It then completes the transaction by generating a STOP condition on the LSS bus. See
QA Chip Interface Specification for more details on the format of the commands used to communicate
with the QA chip [8].



[Diagram: S, IDbyte[7:1], then Byte 0, Byte 1 through Byte M]

S = Start condition
A = Ack
N = Nack
P = Stop condition
Shaded bits driven by slave

Figure 63. Example of LSS read from QA Chip
19.3 Implementation

A block diagram of the LSS master is given in Figure 64. It consists of a block of configuration registers
that are programmed by the CPU and two identical LSS master units that generate the signalling protocols
on the two LSS buses as well as interrupts to the CPU. The CPU initiates and terminates transactions on
the LSS buses by writing an appropriate command to the command register, writes bytes to be transmitted
to a FIFO and reads bytes received from a FIFO, and checks the sources of interrupts by reading status
registers.



[Block diagram: the CPU interface connects to the configuration registers of the Low Speed Serial Interface, which drive the LSS bus 0 and LSS bus 1 master units; the master units connect to the GPIO and ICU.]

Figure 64. LSS block diagram
19.3.1 Definitions of IO

Table 61. LSS IO pins definitions

Port name             Pins  I/O  Description
--------------------  ----  ---  ------------------------------------------------------------
Clocks and Resets
pclk                  1     In   System clock
prst_n                1     In   System reset, synchronous active low
CPU Interface
cpu_rwn               1     In   Common read/not-write signal from the CPU
cpu_adr[6:2]          5     In   CPU address bus. Only 5 bits are required to decode the
                                 address space for this block
cpu_dataout[31:0]     32    In   Shared write data bus from the CPU
cpu_acode[1:0]        2     In   CPU access code signals.
                                 cpu_acode[0] - Program (0) / Data (1) access
                                 cpu_acode[1] - User (0) / Supervisor (1) access
cpu_lss_sel           1     In   Block select from the CPU. When cpu_lss_sel is high both
                                 cpu_adr and cpu_dataout are valid
lss_cpu_rdy           1     Out  Ready signal to the CPU. When lss_cpu_rdy is high it indicates
                                 the last cycle of the access. For a write cycle this means
                                 cpu_dataout has been registered by the LSS block and for a
                                 read cycle this means the data on lss_cpu_data is valid.
lss_cpu_berr          1     Out  LSS bus error signal to the CPU.
lss_cpu_data[31:0]    32    Out  Read data bus to the CPU
lss_cpu_debug_valid   1     Out  Active high. Indicates the presence of valid debug data on
                                 lss_cpu_data.
GPIO for LSS buses
lss_gpio_do[1:0]      2     Out  LSS bus data output
                                 Bit 0 - LSS bus 0
                                 Bit 1 - LSS bus 1
gpio_lss_di[1:0]      2     In   LSS bus data input
                                 Bit 0 - LSS bus 0
                                 Bit 1 - LSS bus 1
lss_gpio_e[1:0]       2     Out  LSS bus data output enable, active high
                                 Bit 0 - LSS bus 0
                                 Bit 1 - LSS bus 1
lss_gpio_clk[1:0]     2     Out  LSS bus clock output
                                 Bit 0 - LSS bus 0
                                 Bit 1 - LSS bus 1
ICU Interface
lss_icu_irq[1:0]      2     Out  LSS interrupt requests
                                 Bit 0 - interrupt associated with LSS bus 0
                                 Bit 1 - interrupt associated with LSS bus 1
19.3.2 Configuration registers

The configuration registers in the LSS block are programmed via the CPU interface. Refer to section 11.4
on page 69 for the description of the protocol and timing diagrams for reading and writing registers in the
LSS block. Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register
reads and writes, the lower 2 bits of the CPU address bus are not required to decode the address space for
the LSS block. Table 62 lists the configuration registers in the LSS block. When reading a register that is
less than 32 bits wide, zeros should be returned on the upper unused bit(s) of lss_cpu_data.

The input cpu_acode signal indicates whether the current CPU access is supervisor, user, program or data.
The configuration registers in the LSS block can only be read or written by a supervisor data access, i.e.
when cpu_acode equals b11. If the current access is a supervisor data access then the LSS responds by
asserting lss_cpu_rdy for a single clock cycle.

If the current access is anything other than a supervisor data access, then the LSS generates a bus error by
asserting lss_cpu_berr for a single clock cycle instead of lss_cpu_rdy, as shown in section 11.4 on page 69.
A write access will be ignored, and a read access will return zero.
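
The access rule above can be modelled in a few lines. This is a behavioural sketch under stated assumptions, not the register block itself: the function name, the dictionary used as register storage and the returned tuple are all hypothetical.

```python
SUPERVISOR_DATA = 0b11  # cpu_acode[1] = supervisor, cpu_acode[0] = data


def lss_register_access(acode, is_write, registers, addr, wdata=0):
    """Model one CPU access; returns (lss_cpu_rdy, lss_cpu_berr, read_data).

    'registers' is a plain dict standing in for the configuration registers
    (an assumption made for illustration only).
    """
    if acode != SUPERVISOR_DATA:
        # Bus error: writes are ignored and reads return zero.
        return (0, 1, 0)
    if is_write:
        registers[addr] = wdata & 0xFFFFFFFF
        return (1, 0, 0)
    return (1, 0, registers.get(addr, 0))


if __name__ == "__main__":
    regs = {}
    print(lss_register_access(0b11, True, regs, 0x04, 200))  # accepted write
    print(lss_register_access(0b01, False, regs, 0x04))      # user access: bus error
    print(lss_register_access(0b11, False, regs, 0x04))      # supervisor data read
```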



Table 62. LSS Control Registers

Address    Register            #bits  Reset        Description
---------  ------------------  -----  -----------  ------------------------------------------------------
Control registers
0x00       Reset               1      0x1          A write to this register causes a reset of the LSS.
0x04       LssClockHighPeriod  16     0x00C8       High period of lss_clk expressed as a number of pclk
                                                   cycles. Transmission over the LSS bus is at a nominal
                                                   rate of 400 kHz, corresponding to a high period of 200
                                                   pclk (160 MHz) cycles for a 50/50 duty cycle.
0x08       LssClockLowPeriod   16     0x00C8       Low period of lss_clk expressed as a number of pclk
                                                   cycles. Transmission over the LSS bus is at a nominal
                                                   rate of 400 kHz, corresponding to a low period of 200
                                                   pclk (160 MHz) cycles for a 50/50 duty cycle.
LSS bus 0 registers
0x10       Lss0IntStatus       3      0x0          LSS bus 0 interrupt status register
                                                   Bit 0 - command completed successfully
                                                   Bit 1 - error during processing of command,
                                                           not-acknowledge received after transmission
                                                           of primary id byte on LSS bus 0
                                                   Bit 2 - error during processing of command,
                                                           not-acknowledge received after transmission
                                                           of data byte on LSS bus 0
                                                   A 1 in a bit of the lss0_status_set signal causes the
                                                   corresponding bit in the Lss0IntStatus register to be
                                                   set. All the bits in Lss0IntStatus are cleared when the
                                                   Lss0Cmd register gets written to.
                                                   (Read only register)
0x14       Lss0CurrentState    4      0x0          Gives the current state of the LSS bus 0 state
                                                   machine. (Read only register)
                                                   (Encoding will be specified upon state machine
                                                   implementation)
0x18       Lss0Cmd             22     0x00_0000    Command register defining sequence of events to
                                                   perform on LSS bus 0 before interrupting CPU.
                                                   A write to this register causes all the bits in the
                                                   Lss0IntStatus register to be cleared as well as
                                                   generating a lss0_new_cmd pulse.
0x1C-0x2C  Lss0Buffer[4:0]     5x32   0x0000_0000  LSS data buffer. Should be filled with transmit data
                                                   before a transmit command, or read for the data bytes
                                                   received after a valid read command.
LSS bus 1 registers
0x30       Lss1IntStatus       3      0x0          LSS bus 1 interrupt status register
                                                   Bit 0 - command completed successfully
                                                   Bit 1 - error during processing of command,
                                                           not-acknowledge received after transmission
                                                           of primary id byte on LSS bus 1
                                                   Bit 2 - error during processing of command,
                                                           not-acknowledge received after transmission
                                                           of data byte on LSS bus 1
                                                   A 1 in a bit of the lss1_status_set signal causes the
                                                   corresponding bit in the Lss1IntStatus register to be
                                                   set. All the bits in Lss1IntStatus are cleared when the
                                                   Lss1Cmd register gets written to.
                                                   (Read only register)
0x34       Lss1CurrentState    4      0x0          Gives the current state of the LSS bus 1 state
                                                   machine. (Read only register)
                                                   (Encoding will be specified upon state machine
                                                   implementation)
0x38       Lss1Cmd             22     0x00_0000    Command register defining sequence of events to
                                                   perform on LSS bus 1 before interrupting CPU.
                                                   A write to this register causes all the bits in the
                                                   Lss1IntStatus register to be cleared as well as
                                                   generating a lss1_new_cmd pulse.
0x3C-0x4C  Lss1Buffer[4:0]     5x32   0x0000_0000  LSS data buffer. Should be filled with transmit data
                                                   before a transmit command, or read for the data bytes
                                                   received after a valid read command.
Debug registers
0x50       LssDebugSel         5      0x00         Selects register for debug output. This value is used
                                                   as the input to the register decode logic instead of
                                                   cpu_adr[6:2] when the LSS block is not being
                                                   accessed by the CPU, i.e. when cpu_lss_sel is 0.
                                                   The output lss_cpu_debug_valid is asserted to indicate
                                                   that the data on lss_cpu_data is valid debug data.
                                                   This data can be multiplexed onto chip pins during
                                                   debug mode.

19.3.2.1 LSS command registers

The LSS command registers define a sequence of events to perform on the respective LSS bus before
issuing an interrupt to the CPU. There is a separate command register and interrupt for each LSS bus. The
format of the command is given in Table 63. The CPU writes to the command register to initiate a sequence
of events on an LSS bus. Once the sequence of events has completed or an error has occurred, an interrupt
is sent back to the CPU.

Some example commands are:

• a single START condition (Start = 1, IdByteEnable = 0, RdWrEnable = 0, Stop = 0)

• a single STOP condition (Start = 0, IdByteEnable = 0, RdWrEnable = 0, Stop = 1)

• a START condition followed by transmission of the id byte (Start = 1, IdByteEnable = 1, RdWrEnable
= 0, Stop = 0, IdByte contains primary id byte)

• a write transfer of 20 bytes from the data buffer (Start = 0, IdByteEnable = 0, RdWrEnable = 1,
RdWrSense = 0, Stop = 0, TxRxByteCount = 20)

• a read transfer of 8 bytes into the data buffer (Start = 0, IdByteEnable = 0, RdWrEnable = 1,
RdWrSense = 1, ReadNack = 0, Stop = 0, TxRxByteCount = 8)

• a complete read transaction of 16 bytes (Start = 1, IdByteEnable = 1, RdWrEnable = 1, RdWrSense = 1,
ReadNack = 1, Stop = 1, IdByte contains primary id byte, TxRxByteCount = 16), etc.

The CPU can thus program the number of bytes to be transmitted or received (up to a maximum of 20) on
the LSS bus before it gets interrupted. This allows it to insert arbitrary delays in a transfer at a byte
boundary. For example, the CPU may want to transmit 30 bytes to a QA chip but insert a delay between
the 20th and 21st bytes sent. It does this by first writing 20 bytes to the data buffer. It then writes a command
to generate a START condition, send the primary id byte and then transmit the 20 bytes from the data buffer.
When interrupted by the LSS block to indicate successful completion of the command, the CPU can then
write the remaining 10 bytes to the data buffer. It can then wait for a defined period of time before writing
a command to transmit the 10 bytes from the data buffer and generate a STOP condition to terminate the
transaction over the LSS bus.
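
The 30-byte example can be generalised to any write length. The sketch below is only an illustration of that splitting; the function name and the dictionary representation of a command are assumptions, while the field names follow the command register fields.

```python
def plan_write(total_bytes, max_chunk=20):
    """Split a write of total_bytes into commands of at most max_chunk bytes.

    The first command carries START and the id byte; the last carries STOP.
    Returns a list of dicts keyed by command field name (hypothetical form).
    """
    if total_bytes < 1:
        raise ValueError("need at least one byte to transfer")
    plan, sent = [], 0
    while sent < total_bytes:
        chunk = min(max_chunk, total_bytes - sent)
        first = sent == 0
        last = sent + chunk == total_bytes
        plan.append({
            "Start": int(first),
            "IdByteEnable": int(first),
            "RdWrEnable": 1,
            "RdWrSense": 0,          # 0 = write
            "Stop": int(last),
            "TxRxByteCount": chunk,
        })
        sent += chunk
    return plan


if __name__ == "__main__":
    for cmd in plan_write(30):
        print(cmd)
```

Between two commands of the plan the CPU refills the data buffer and may wait for as long as it needs, which is where the arbitrary delay comes from.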

An interrupt to the CPU is generated for one cycle when any bit in LssNIntStatus is set. The CPU can read
LssNIntStatus to discover the source of the interrupt and can clear a bit in LssNIntStatus by writing a 1 to
the corresponding bit in the LssNIntStatus register. Alternatively, the CPU can start a new command, which
will automatically reset all LssNIntStatus bits.



Table 63. LSS command register description

bit(s)  name           description
------  -------------  -------------------------------------------------------------------------
0       Start          When 1, issue a START condition on the LSS bus.
1       IdByteEnable   ID byte transmit enable:
                       1 - transmit byte in IdByte field
                       0 - ignore byte in IdByte field
2       RdWrEnable     Read/write transfer enable:
                       0 - ignore settings of RdWrSense, ReadNack and TxRxByteCount
                       1 - if RdWrSense is 0, then perform a write transfer of TxRxByteCount
                           bytes from the data buffer.
                           If RdWrSense is 1, then perform a read transfer of TxRxByteCount
                           bytes into the data buffer. Each byte should be acknowledged and the
                           last byte received is acknowledged/not-acknowledged according to the
                           setting of ReadNack.
3       RdWrSense      Read/write sense indicator:
                       0 - write
                       1 - read
4       ReadNack       Indicates, for a read transfer, whether to issue an acknowledge or a
                       not-acknowledge after the last byte received (indicated by TxRxByteCount).
                       0 - issue acknowledge after last byte received
                       1 - issue not-acknowledge after last byte received
5       Stop           When 1, issue a STOP condition on the LSS bus.
7:6     reserved       Must be 0
15:8    IdByte         Byte to be transmitted if IdByteEnable is 1. Bit 8 corresponds to the LSB.
20:16   TxRxByteCount  Number of bytes to be transmitted from the data buffer or the number of
                       bytes to be received into the data buffer. The maximum value that should
                       be programmed is 20, as the size of the data buffer is 20 bytes.
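
The bit layout of Table 63 can be exercised with a small encoder. This is a sketch for checking the field positions, not driver code; the function and parameter names are hypothetical.

```python
def encode_lss_cmd(start=0, id_byte_enable=0, rd_wr_enable=0, rd_wr_sense=0,
                   read_nack=0, stop=0, id_byte=0, byte_count=0):
    """Pack the command fields into a register value (bits 7:6 stay 0)."""
    flags = (start, id_byte_enable, rd_wr_enable, rd_wr_sense, read_nack, stop)
    if any(f not in (0, 1) for f in flags):
        raise ValueError("single-bit fields must be 0 or 1")
    if not 0 <= id_byte <= 0xFF:
        raise ValueError("IdByte is 8 bits wide")
    if not 0 <= byte_count <= 20:
        raise ValueError("TxRxByteCount must not exceed the 20 byte buffer")
    value = 0
    for position, flag in enumerate(flags):   # bits 0..5 in table order
        value |= flag << position
    value |= id_byte << 8                     # bits 15:8, bit 8 is the LSB
    value |= byte_count << 16                 # bits 20:16
    return value


if __name__ == "__main__":
    # Complete read transaction of 16 bytes, with an arbitrary example id byte.
    print(hex(encode_lss_cmd(1, 1, 1, 1, 1, 1, id_byte=0x07, byte_count=16)))
```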



The data buffer is implemented in the LSS master block. When the CPU writes to the LssNBuffer registers
the data written is presented to the LSS master block via the lssN_buffer_wrdata bus and the configuration
registers block pulses the lssN_buffer_wen bit corresponding to the register written. For example, if
LssNBuffer[2] is written to, lssN_buffer_wen[2] will be pulsed. When the CPU reads the LssNBuffer registers
the configuration registers block reflects the lssN_buffer_rdata bus back to the CPU.

19.3.3 LSS master unit 

The LSS master unit is instantiated for both LSS bus 0 and LSS bus 1. It controls transactions on the LSS
bus by means of the state machine shown in Figure 65, which interprets the commands that are written by
the CPU. It also contains a single 20 byte data buffer used for transmitting and receiving data.

The CPU can write data to be transmitted on the LSS bus by writing to the LssNBuffer registers. It can also
read data that the LSS master unit receives on the LSS bus by reading the same registers. The LSS master
always transmits or receives bytes to or from the data buffer in the same order.

For a transmit command, LssNBuffer[0][7:0] gets transmitted first, then LssNBuffer[0][15:8],
LssNBuffer[0][23:16], LssNBuffer[0][31:24], LssNBuffer[1][7:0] and so on until TxRxByteCount number of
bytes are transmitted. A receive command fills data into the buffer in the same order. For each new command
the buffer start point is reset.
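
The byte ordering described above amounts to packing bytes into the five 32-bit buffer words least significant byte first. A minimal sketch, with hypothetical helper names:

```python
def pack_buffer(data: bytes) -> list:
    """Place bytes into five 32-bit words in transmit order.

    Byte 0 goes to word 0 bits [7:0], byte 1 to word 0 bits [15:8], and so on.
    """
    if len(data) > 20:
        raise ValueError("the data buffer holds at most 20 bytes")
    words = [0] * 5
    for i, value in enumerate(data):
        words[i // 4] |= value << (8 * (i % 4))
    return words


def unpack_buffer(words, count: int) -> bytes:
    """Recover the first 'count' bytes in the order they were received."""
    return bytes((words[i // 4] >> (8 * (i % 4))) & 0xFF for i in range(count))


if __name__ == "__main__":
    packed = pack_buffer(bytes([0x11, 0x22, 0x33, 0x44, 0x55]))
    print([hex(w) for w in packed])
```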

All state machine outputs, flags and counters are cleared on reset. After a reset the state machine remains
in the Idle state until lss_cmd_valid equals 1. If the Start bit of the command is 0 the state machine
proceeds directly to the CheckIdByteEnable state. If the Start bit is 1 it proceeds to the GenerateStart state
and issues a START condition on the LSS bus.

In the CheckIdByteEnable state, if the IdByteEnable bit of the command is 0 the state machine proceeds
directly to the CheckRdWrEnable state. If the IdByteEnable bit is 1 the state machine enters the SendIdByte
state and the byte in the IdByte field of the command is transmitted on the LSS. The WaitForIdAck
state is then entered. If the byte is acknowledged, the state machine proceeds to the CheckRdWrEnable
state. If the byte is not-acknowledged, the state machine proceeds to the GenerateInterrupt state and issues
an interrupt to indicate a not-acknowledge was received after transmission of the primary id byte.

In the CheckRdWrEnable state, if the RdWrEnable bit of the command is 0 the state machine proceeds
directly to the CheckStop state. If the RdWrEnable bit is 1, count is loaded with the value of the
TxRxByteCount field of the command and the state machine enters either the ReceiveByte state if the
RdWrSense bit of the command is 1 or the TransmitByte state if the RdWrSense bit is 0.

For a write transaction, the state machine keeps transmitting bytes from the data buffer, decrementing
count after each byte transmitted, until count is 1. If all the bytes are successfully transmitted the state
machine proceeds to the CheckStop state. If the slave QA chip not-acknowledges a transmitted byte, the
state machine indicates this error by issuing an interrupt to the CPU and then entering the GenerateInterrupt
state.

For a read transaction, the state machine keeps receiving bytes into the data buffer, decrementing count
after each byte received, until count is 1. After each byte received the LSS master must issue an
acknowledge. After the last expected byte (i.e. when count is 1) the state machine checks the ReadNack bit
of the command to see whether it must issue an acknowledge or not-acknowledge for that byte. The
CheckStop state is then entered.

In the CheckStop state, if the Stop bit of the command is 0 the state machine proceeds directly to the
GenerateInterrupt state. If the Stop bit is 1 it proceeds to the GenerateStop state and issues a STOP condition
on the LSS bus before proceeding to the GenerateInterrupt state. In both cases an interrupt is issued to
indicate successful completion of the command.

The state machine then enters the Idle state to await the next command.
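
The state sequence for a given command can be traced with a short model. This is a reading of the description above rather than the implemented state machine: the function name, its arguments and the per-byte ordering of the WaitForAck, SendAck and SendNack states are assumptions, and the behaviour for a byte count of 0 is not defined by the text.

```python
def trace_command(start, id_byte_enable, rd_wr_enable, rd_wr_sense, read_nack,
                  stop, count, id_acked=True, nack_at=None):
    """Return (visited states, LssNIntStatus bit set) for one command.

    id_acked : whether the slave acknowledges the id byte
    nack_at  : 1-based index of a transmitted byte the slave does not acknowledge
    """
    states = ["Idle"]
    if start:
        states.append("GenerateStart")
    states.append("CheckIdByteEnable")
    if id_byte_enable:
        states += ["SendIdByte", "WaitForIdAck"]
        if not id_acked:
            return states + ["GenerateInterrupt", "Idle"], 1
    states.append("CheckRdWrEnable")
    if rd_wr_enable:
        for n in range(1, count + 1):
            if rd_wr_sense:  # read: acknowledge every byte except, optionally, the last
                last = n == count
                states += ["ReceiveByte", "SendNack" if last and read_nack else "SendAck"]
            else:            # write: wait for the slave to acknowledge each byte
                states += ["TransmitByte", "WaitForAck"]
                if nack_at == n:
                    return states + ["GenerateInterrupt", "Idle"], 2
    states.append("CheckStop")
    if stop:
        states.append("GenerateStop")
    return states + ["GenerateInterrupt", "Idle"], 0


if __name__ == "__main__":
    visited, status_bit = trace_command(1, 1, 1, 1, 1, 1, count=2)
    print(" -> ".join(visited), "| status bit", status_bit)
```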

The CPU may abort the current transfer at any time by performing a write to the Reset register of the LSS 
block. 
19.3.3.1 START and STOP generation

START and STOP conditions, which signal the beginning and end of data transmission, occur when the
LSS master generates a falling and rising edge respectively on the data while the clock is high.

In the GenerateStart state, lss_gpio_clk is held high with lss_gpio_e remaining deasserted (so the data line
is pulled high externally) for LssClockHighPeriod pclk cycles. Then lss_gpio_e is asserted and
lss_gpio_do is pulled low (to drive a 0 on the data line, creating a falling edge) with lss_gpio_clk
remaining high for another LssClockHighPeriod pclk cycles.

In the GenerateStop state, both lss_gpio_clk and lss_gpio_do are pulled low followed by the assertion of
lss_gpio_e to drive a 0 while the clock is low. After LssClockLowPeriod pclk cycles, lss_gpio_clk is set
high. After a further LssClockHighPeriod pclk cycles, lss_gpio_e is deasserted to release the data bus and
create a rising edge on the data bus during the high period of the clock.

19.3.3.2 Clock pulse generation

The LSS master holds lss_gpio_clk high while the LSS bus is inactive. A clock pulse is generated for each
bit transmitted or received over the LSS bus. It is generated by first holding lss_gpio_clk low for
LssClockLowPeriod pclk cycles, and then high for LssClockHighPeriod pclk cycles.

19.3.3.3 Data reception

The input data, gpio_lss_di, is first synchronised to the pclk domain by means of two flip-flops clocked by
pclk. The LSS master generates a clock pulse for each bit received. The output lss_gpio_e is deasserted on
the falling edge of lss_gpio_clk to release the data bus. The value on the synchronised gpio_lss_di is
sampled on the rising edge of lss_gpio_clk (the data should be averaged over a further 3 stage register to avoid
possible glitch detection). The data is only considered as a valid bit at the next falling edge of lss_gpio_clk
provided a START or STOP is not generated in the meantime.

In the ReceiveByte state, the state machine generates 8 clock pulses. On each rising edge of lss_gpio_clk
the synchronised gpio_lss_di is sampled. The first bit sampled is LssNBuffer[0][7], the second
LssNBuffer[0][6], etc to LssNBuffer[0][0]. For each byte received the state machine either sends a NAK or an
ACK depending on the command configuration and the number of bytes received.

In the SendNack state the state machine generates a single clock pulse. lss_gpio_e is deasserted and the
LSS data line is pulled high externally to issue a not-acknowledge.

In the SendAck state the state machine generates a single clock pulse. lss_gpio_e is asserted and a 0 driven
on lss_gpio_do after the lss_gpio_clk falling edge to issue an acknowledge.

19.3.3.4 Data transmission

The LSS master generates a clock pulse for each bit transmitted. Data is output on the LSS bus on the
falling edge of lss_gpio_clk.

When the LSS master drives a logical zero on the bus it will assert lss_gpio_e and drive a 0 on lss_gpio_do
after the lss_gpio_clk falling edge. lss_gpio_e will remain asserted and lss_gpio_do will remain low until the
next lss_clk falling edge.

When the LSS master drives a logical one, lss_gpio_e should be deasserted at the lss_gpio_clk falling edge and
remain deasserted at least until the next lss_gpio_clk falling edge. This is because the LSS bus will be
externally pulled up to logical one via a pull-up resistor.

In the SendIdByte state, the state machine generates 8 clock pulses to transmit the byte in the IdByte field
of the current valid command. On each falling edge of lss_gpio_clk a bit is driven on the data bus as
outlined above. On the first falling edge IdByte[7] is driven on the data bus, on the second falling edge
IdByte[6] is driven out, etc.

In the TransmitByte state, the state machine generates 8 clock pulses to transmit the byte at the output of
the transmit FIFO. On each falling edge of lss_gpio_clk a bit is driven on the data bus as outlined above.
On the first falling edge LssNBuffer[0][7] is driven on the data bus, on the second falling edge
LssNBuffer[0][6] is driven out, and so on to LssNBuffer[0][0].

In the WaitForAck state, the state machine generates a single clock pulse. On the rising edge of
lss_gpio_clk the synchronised gpio_lss_di is sampled. A 0 indicates an acknowledge and ack_detect is
pulsed; a 1 indicates a not-acknowledge and nack_detect is pulsed.

19.3.3.5 Data rate control

The CPU can control the data rate by setting the clock period of the LSS bus clock by programming
appropriate values in LssClockHighPeriod and LssClockLowPeriod. The default setting for both registers is 200
(pclk cycles), which corresponds to a transmission rate of 400 kHz on the LSS bus.
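
The relationship between the two period registers and the bus rate can be checked numerically. A minimal sketch, assuming a 160 MHz pclk as quoted for the period registers; the function name and the duty-cycle parameter are hypothetical.

```python
def lss_clock_periods(pclk_hz=160_000_000, lss_hz=400_000, high_fraction=0.5):
    """Return (LssClockHighPeriod, LssClockLowPeriod) in pclk cycles."""
    total = round(pclk_hz / lss_hz)        # pclk cycles per LSS clock period
    high = round(total * high_fraction)
    return high, total - high


if __name__ == "__main__":
    high, low = lss_clock_periods()
    print(high, low, hex(high))            # default 400 kHz setting
    print(lss_clock_periods(lss_hz=100_000))
```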
State machine outputs, lss_icu_irq and LssStatusSet are zero unless otherwise indicated.

[State diagram: recoverable labels include GenerateStart, CheckIdByteEnable and SendIdByte, with transitions on Start = 1, IdByteEnable = 1, ack_detect = 1, nack_detect = 1 (lss_status_set[1] = 1, lss_icu_irq = 1), RdWrEnable = 1 AND RdWrSense = 0 or 1 (count = TxRxByteCount), and count > 1 (count decremented).]

Figure 65. LSS master state machine
DRAM Subsystem

20 DRAM Interface Unit (DIU)

20.1 Overview



Figure 66 shows how the DIU provides the interface between the on-chip 20 Mbit embedded DRAM and
the rest of SoPEC. In addition to outlining the functionality of the DIU, this chapter provides a top-level
overview of the memory storage and access patterns of SoPEC and the buffering required in the various
SoPEC blocks to support those access requirements.

The main functionality of the DIU is to arbitrate between requests for access to the embedded DRAM and
provide read or write accesses to the requesters. The DIU must also implement the initialisation sequence
and refresh logic for the embedded DRAM.

The arbitration mechanism is a hierarchical timeslot mechanism providing guaranteed bandwidth and
latency to each DIU requester, with unused slots re-allocated to provide best effort accesses. The
arbitration scheme is fully programmable.

The interface between the DIU and the SoPEC requesters is similar to the interface on PEC1, i.e. separate
control, read data and write data busses.

The embedded DRAM is used principally to store:

• CPU program code and data.

• PEP (re)programming commands.

• Compressed pages containing contone, bi-level and raw tag data and header information.

• Decompressed contone and bi-level data.

• Dotline store during a print.

• Print setup information such as tag format structures, dither matrices and dead nozzle information.
Figure 66. SoPEC System Top Level partition
20.2 IBM Cu-11 Embedded DRAM

20.2.1 Single bank

SoPEC will use the 1.5 V core voltage option in IBM's 0.13 µm class Cu-11 process.



The random read/write cycle time and the refresh cycle time is 3 cycles at 160 MHz [16]. An open page
access will complete in 1 cycle if the page mode select signal is clocked at 320 MHz or 2 cycles if the page
mode select signal is clocked every 160 MHz cycle. The page mode select signal will be clocked at 320
MHz in SoPEC. The DRAM word size is 256 bits.

Most SoPEC requesters will make single 256 bit DRAM accesses (see Section 20.4). These accesses will
take 3 cycles as they are random accesses, i.e. they will most likely be to a different memory row than the
previous access.

The entire 20 Mbit DRAM will be implemented as a single memory bank. In Cu-11, the maximum single
instance size is 16 Mbit. The first 1 Mbit tile of each instance contains an area overhead so the cheapest
solution in terms of area is to have only 2 instances. 16 Mbit and 4 Mbit instances would together consume
an area of 14.63 mm², as would 2 times 10 Mbit instances. 4 times 5 Mbit instances would require
17.2 mm².

The instance size will determine the frequency of refresh. Each refresh requires 3 clock cycles. In Cu-11
each row consists of 8 columns of 256-bit words. This means that 16 Mbit requires 8192 rows. A complete
DRAM refresh is required every 3.2 ms. This would mean a row would have to be refreshed every 62
cycles. Two times 10 Mbit instances would require a refresh every 100 clock cycles, if the instances are
refreshed in parallel. Having 4 times 5 Mbit instances means a refresh is required only every 200 cycles.

The SoPEC DRAM will be constructed as two 10 Mbit instances implemented as a single memory bank.
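
The refresh intervals quoted above follow from the row count of an instance and the 3.2 ms refresh period. A small sketch of that arithmetic, assuming 1 Mbit = 1024 x 1024 bits and a 160 MHz clock; the function name is hypothetical.

```python
def refresh_interval(instance_mbit, clock_hz=160_000_000, refresh_period_us=3200,
                     columns_per_row=8, word_bits=256):
    """Return (rows, clock cycles available between row refreshes)."""
    rows = instance_mbit * 1024 * 1024 // (columns_per_row * word_bits)
    cycles_per_period = clock_hz * refresh_period_us // 1_000_000
    return rows, cycles_per_period // rows


if __name__ == "__main__":
    for size in (16, 10, 5):
        rows, cycles = refresh_interval(size)
        print(f"{size} Mbit instance: {rows} rows, one row refresh every {cycles} cycles")
```

With each refresh taking 3 cycles, the two 10 Mbit instances refreshed in parallel spend about 3 cycles in every 100 on refresh.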
20.3 SoPEC Memory Usage Requirements

The memory usage requirements for the embedded DRAM are shown in Table 64.

Table 64. Memory Usage Requirements

Block                       Size                            Description
--------------------------  ------------------------------  ------------------------------------------------
Compressed page store       2048 Kbytes                     Compressed data page store for bi-level
                                                            and contone data
Decompressed Contone Store  108 Kbyte                       13824 lines with scale factor 6 = 2304 pixels,
                                                            store 12 lines, 4 colors = 108 kB
                                                            13824 lines with scale factor 5 = 2765 pixels,
                                                            store 12 lines, 4 colors = 130 kB
Spot line store             5.1 Kbyte                       13824 dots/line so 3 lines is 5.1 kB
Tag Format Structure        55 Kbyte (384 dot line tags @   55 kB for 384 dot line tags
                            1600 dpi)                       2.5 mm tags (1/10th inch) @ 1600 dpi require
                            12 Kbyte (2.5 mm tags @ 800     160 dot lines = 160/384 x 55 or 23 kB
                            dpi)                            2.5 mm tags @ 800 dpi require 80/384 x 55 =
                                                            12 kB
Dither Matrix store         4 Kbytes                        64x64 dither matrix is 4 kB
                                                            128x128 dither matrix is 16 kB
                                                            256x256 dither matrix is 64 kB
DNC Dead Nozzle Table       1.4 Kbytes                      Delta encoded, (10 bit delta position + 6 dead
                                                            nozzle mask) x % Dnozzle
                                                            5% dead nozzles requires (10+6) x 692 Dnozzles
                                                            = 1.4 Kbytes
Dot-line store              319 Kbytes                      Assume each color row is separated by 5 dot
                                                            lines on the print head
                                                            The dot line store will be 0+5+10...50+55 =
                                                            330 half dot lines + 48 extra half dot lines (4
                                                            per dot row) = 378 half dot lines = 319 Kbytes
PCU Program code            8 Kbytes                        1024 commands of 64 bits = 8 kB
CPU                         64 Kbytes                       Program code and data
TOTAL                       2570 Kbytes (12 Kbyte TFS
                            storage)
                            2613 Kbytes (55 Kbyte TFS)

Note:

• Total storage of 2570 Kbytes will be reduced to 2560 Kbytes to align to 20 Mbit DRAM.
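
Several entries of Table 64 can be cross-checked with simple arithmetic. The sketch below is illustrative only; it assumes 1 Kbyte = 1024 bytes, 1 bit per dot, and that a half dot line holds 13824 / 2 dots. These assumptions are not stated in the table but reproduce its figures.

```python
KB = 1024

contone_kb = 2304 * 12 * 4 / KB                  # 12 lines, 4 colors, 1 byte per pixel
spot_kb = 13824 * 3 / 8 / KB                     # 3 lines at 1 bit per dot
half_dot_lines = sum(range(0, 60, 5)) + 48       # 0+5+...+55 plus 48 extra
dot_line_kb = half_dot_lines * (13824 // 2) / 8 / KB
dead_nozzle_kb = 692 * (10 + 6) / 8 / KB         # 16 bits per dead nozzle entry
pcu_kb = 1024 * 64 / 8 / KB                      # 1024 commands of 64 bits

# Sum of the sizes as listed in the table (12 Kbyte TFS case).
listed_total_kb = sum([2048, 108, 5.1, 12, 4, 1.4, 319, 8, 64])
dram_kb = 20 * 1024 * 1024 / 8 / KB              # 20 Mbit DRAM in Kbytes

if __name__ == "__main__":
    print(contone_kb, round(spot_kb, 1), half_dot_lines, round(dot_line_kb))
    print(round(dead_nozzle_kb, 1), pcu_kb, listed_total_kb, dram_kb)
```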
20.4 SoPEC Memory Access Patterns

Table 65 shows a summary of the blocks on SoPEC requiring access to the embedded DRAM and their
individual memory access patterns. Most blocks will access the DRAM in single 256-bit accesses. All
accesses must be padded to 256 bits except for 64-bit CDU write accesses and CPU write accesses. Bits
which should not be written are masked using the individual DRAM bit write inputs or byte write inputs,
depending on the foundry. Using single 256-bit accesses means that the buffering required in the SoPEC
DRAM requesters will be minimized.



Table 65. Memory access patterns of SoPEC DRAM Requesters

DRAM requester  Direction  Memory access pattern
--------------  ---------  ----------------------------------------------------------------------------
CPU             R          Single 256-bit reads.
                W          Single 32-bit, 16-bit or 8-bit writes.
SCB             W          Single 256-bit writes.
CDU             R          Single 256-bit reads of the compressed contone data.
                W          Each CDU access is a write to 4 consecutive DRAM words in the same row
                           but only 64 bits of each word are written with the remaining bits write
                           masked.
                           The access time for this 4 word page mode burst is 3 + 1 + 1 + 1 = 6 cycles
                           if the page mode select signal is clocked at 320 MHz.
CFU             R          Single 256 bit reads.
LBD             R          Single 256 bit reads.
SFU             R          Separate single 256 bit reads for previous and current line but sharing the
                           same DIU interface.
                W          Single 256 bit writes.
TE(TD)          R          Single 256 bit reads. Each read returns 2 times 128 bit tags.
TE(TFS)         R          Single 256 bit reads. TFS is 136 bytes. This means there is unused data in
                           the fifth 256 bit read. A total of 5 reads is required.
HCU             R          Single 256 bit reads. 128 x 128 dither matrix requires 4 reads per line with
                           double buffering. 256 x 256 dither matrix requires 8 reads at the end of the
                           line with single buffering.
                           Dither matrices have start address, end address and line advance increment.
DNC             R          Single 256 bit dead nozzle table reads. Each dead nozzle table read contains
                           16 dead-nozzle table entries each of 10 delta bits plus 6 dead nozzle
                           mask bits.
DWU             W          Single 256 bit writes since enable/disable DRAM access per color plane.
LLU             R          Single 256 bit reads since enable/disable DRAM access per color plane.
PCU             R          Single 256 bit reads. Each PCU command is 64 bits so each 256 bit word
                           can contain 4 PCU commands.
                           PCU reads from DRAM used for reprogramming PEP should be executed
                           with minimum latency.
                           If this occurs between pages then there will be free bandwidth as most of
                           the other SoPEC Units will not be requesting from DRAM. If this occurs
                           between bands then the LBD, CDU and TE bandwidth will be free. So the
                           PCU should have a high priority to access to any spare bandwidth.
Refresh                    Single refresh.
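
The read counts and the burst timing quoted in Table 65 can be reproduced with two small helpers. This is a sketch with hypothetical function names; the 3-cycle random access and 1-cycle page mode access assume the page mode select signal is clocked at 320 MHz.

```python
def reads_needed(num_bytes, word_bits=256):
    """Number of 256-bit DRAM reads needed to fetch num_bytes (rounded up)."""
    return -(-num_bytes * 8 // word_bits)


def page_burst_cycles(words, first_access=3, page_access=1):
    """Cycles for a page mode burst to consecutive words in the same row."""
    return first_access + (words - 1) * page_access


if __name__ == "__main__":
    print(reads_needed(136))       # tag format structure
    print(reads_needed(128))       # one line of a 128 x 128 dither matrix
    print(reads_needed(256))       # one line of a 256 x 256 dither matrix
    print(page_burst_cycles(4))    # CDU 4 word write burst
```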
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20.5 Buffering Required in SoPEC DRAM Requesters 

If each DIU access is a single 256-bit access then we need to provide a 256-bit double buffer in the DRAM
requester. If the DRAM requester has a 64-bit interface then this can be implemented as an 8 x 64-bit
FIFO.



Table 66. Buffer sizes in SoPEC DRAM requesters



DRAM 
Requester 


Direction 


Access patterns 


Buffering required in
block


CPU 


R 


Single 256-bit reads.


Cache. 


W 


Single 32-bit writes but allowing 16-bit or byte
addressable writes.


None. 


SCB


W 


Single 256-bit writes.


Double
256-bit buffer.


CDU 


R 


Single 256-bit reads of the compressed contone
data. 


Double 256-bit buffer.


W


Each CDU access is a write to 4 consecutive DRAM
words in the same row but only 64 bits of each word 
are written with the remaining bits write masked. 


Double half JPEG block
buffer.


CFU 


R 


Single 256 bit reads.


Double 256-bit buffer. 


LBD 


R 


Single 256 bit reads.


C9^rijli DwilCi. 


SFU 


R 


Separate single 256 bit reads for previous and current
line but sharing the same DIU interface.


Double 256-bit buffer for 
each read channel. 


W 


Single 256 bit writes. 


Double 256-bit buffer.


TE(TD)


R 


Single 256 bit reads. 


Double 256-bit buffer.


TE(TFS) 


R 


Single 256 bit reads. TFS is 136 bytes. This means
there is unused data in the fifth 256 bit read. A total
of 5 reads is required.


Double line-buffer for 136
bytes implemented in TE.


HCU 


R 


Single 256 bit reads. 128 x 128 dither matrix
requires 4 reads per line with double buffering. 256 x
256 dither matrix requires 8 reads at the end of the
line with single buffering.


Configurable between double
128 byte buffer and
single 256 byte buffer.


DNC 


R 


Single 256 bit reads 


Double 256-bit buffer.
Deeper buffering could be 
specified to cope with local 
clusters of dead nozzles. 


DWU 


W 


Single 256 bit writes per enabled odd/even color
plane.


Double 256-bit buffer per 
color plane. 


LLU


R 


Single 256 bit reads per enabled odd/even color
plane.


Double 256-bit buffer per
color plane. 


PCU 


R 


Single 256 bit reads. Each PCU command is 64 bits 
so each 256 bit DRAM read can contain 4 PCU com- 
mands. Requested command is read from DRAM 
together with the next 3 contiguous 64-bit words which are
cached to avoid unnecessary DRAM reads. 


Single 256-bit buffer.


Refresh 




Single refresh. 


None. 
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20.6 SoPEC DIU Bandwidth Requirements 

Table 67: SoPEC DIU Bandwidth Requirements 



CPU 



w 




SCB



w 



7802 



0.326 



0.328 



0.5 



CDU



W 



128 (SF = 4), 288 (SF = 6),
1:1 compression (note 3)



32/n^2 (SF = n),
0.9 (SF = 6),
2 (SF = 4)
(1:1 compression)



32/(10*n^2) (SF = n),
0.09 (SF = 6),
0.2 (SF = 4)
(10:1 compression) (note 4)



For individual accesses:
16 cycles (SF = 4), 36
cycles (SF = 6), n^2 cycles
(SF = n).

Will be implemented as a
page mode burst of 4
accesses every 64 cycles
(SF = 4), 144 (SF = 6),
4*n^2 (SF = n) cycles (note 5)



64/n^2 (SF = n),
1.8 (SF = 6),
4 (SF = 4)



32/n^2 (SF = n),
0.9 (SF = 6),
2 (SF = 4) (note 6)



1 (SF = 6)
2 (SF = 4)



2 (SF = 6)
4 (SF = 4)


CFU 



32 (SF = 4), 48 (SF = 6) (note 7)



256 (1:1 compression) (note 8)



32/n (SF = n),
5.4 (SF = 6),
8 (SF = 4)



32/n (SF = n),
5.4 (SF = 6),
8 (SF = 4)



5.5 (SF = 6)
8 (SF = 4)



0.1 (10:1 compression) (note 9)



LBD 



1 (1:1 compression) 



SFU 



W 



128 



10 



256^ 



TE(TD)



252 



12 



1.02 



1.02 



1.25 



TE(TFS)



5 reads per line 



13 



0.093 



0.093 



0.25 



HCU 



4 reads per line for 128 x
128 dither matrix (note 14)



0.074 



0.074 



0.25 



DNC 



106 (5% dead nozzles
10-bit delta encoded) (note 15)



2.4 (clump of dead
nozzles)



0.8 (equally spaced 
dead nozzles) 



2.5 



DWU 



W 



6 writes every 256 (note 16)



LLU 



8 reads every 256 (note 17)



PCU 



256 (note 18)



Refresh 



100 (note 19)



2.56 



2.56 



2.75 



TOTAL 



SF = 6: 34
SF = 4: 39.5
excluding CPU



SF = 6: 27.5
SF = 4: 31.2
excluding CPU



SF = 6: 35
excluding CPU.
SF = 4: 40.5
excluding CPU



Notes: 

1 : The number of allocated timeslots is based on 64 timeslots each of 1 bit/cycle but broken down to a granularity of 
0.25 bit/cycle. 

2: 50 Mbit/s is 0.328 bits/cycle or 256 bits every 780 cycles.

3: At 1:1 compression CDU must read a 4 color pixel (32 bits) every SF^2 cycles.
4: At 10:1 average compression CDU must read a 4 color pixel (32 bits) every 10*SF^2 cycles.

5: 4 color pixel (32 bits) is required, on average, by the CFU every SF^2 (scale factor) cycles.

The time available to write the data is a function of the size of the buffer in DRAM. 1.5 buffering means 4 color pixel
(32 bits) must be written every SF^2 / 2 (scale factor) cycles. Therefore, at a scale factor of SF, 64 bits are required
every SF^2 cycles.

Since 64 valid bits are written per 256-bit write (Figure 104 on page 282) then the DRAM is accessed every SF^2
cycles i.e. at SF4 an access every 16 cycles, at SF6 an access every 36 cycles.

If a page mode burst of 4 accesses is used then each access takes (3 + 1 + 1 + 1) equals 6 cycles. This means at SF, a set
of 4 back-to-back accesses must occur every 4*SF^2 cycles. This assumes the page mode select signal is clocked at 320
MHz. CDU timeslots therefore take 6 cycles.

For scale factors lower than 4 double buffering will be used.

6: The average bandwidth is 1/2 the peak bandwidth in the case of 1.5 buffering.

7: 4 color pixel (32 bits) read by CFU every SF cycles. At SF4, 32 bits is required every 4 cycles or 256 bits every 32
cycles. At SF6, 32 bits every 6 cycles or 256 bits every 48 cycles.

8: At 1:1 compression require 1 bit/cycle or 256 bits every 256 cycles.

9: The average bandwidth required at 10:1 compression is 0.1 bits/cycle.

10: Two separate reads of 1 bit/cycle. 

11: Write at 1 bit/cycle. 

12: Each tag can be consumed in at most 126 dot cycles and requires 128 bits. This is a maximum rate of 256 bits 
every 252 cycles. 

13: 17 x 64 bit reads per line in PEC1 is 5 x 256 bit reads per line in SoPEC. Double-line buffered storage.
14: 128 bytes read per line is 4 x 256 bit reads per line. Double-line buffered storage. 

15: 5% dead nozzles 10-bit delta encoded stored with 6-bit dead nozzle mask requires 0.8 bits/cycle read access or a
256-bit access every 320 cycles. This assumes the dead nozzles are evenly spaced out. In practice dead nozzles are 
likely to be clumped. Peak bandwidth is estimated as 3 times average bandwidth. 
16: 6 bits/cycle requires 6 x 256 bit writes every 256 cycles. 

17: 6 bits/160 MHz SoPEC cycle average but will peak at 2 x 6 bits per 106 MHz print head cycle or 8 bits/ SoPEC 
cycle. The PHI can equalise the DRAM access rate over the line so that the peak rate equals the average rate of 8 bits/ 
cycle. 

18: Assume one 256-bit read per 256 cycles is sufficient i.e. maximum latency of 256 cycles per access is allowable.
19: As an example assume refresh must occur every 3.2 ms. Refresh occurs a row at a time over 5120 rows of 2 parallel
10 Mbit instances. Each refresh takes 3 cycles. This is equivalent to a timeslot every 100 cycles.
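
The arithmetic behind several of the notes above can be checked with a short script. This is only an illustrative sketch: the function names are hypothetical, and the 160 MHz clock and 256-bit word width are taken from the surrounding text.

```python
from fractions import Fraction

PCLK_HZ = 160_000_000   # SoPEC clock frequency assumed in the notes
WORD_BITS = 256         # width of one embedded DRAM word


def cdu_read_bits_per_cycle(sf, compression=1):
    """Notes 3 and 4: one 32-bit pixel every compression * SF^2 cycles."""
    return Fraction(32, compression * sf * sf)


def cycles_per_word(bits_per_cycle):
    """Cycles between 256-bit accesses needed to sustain a bandwidth."""
    return Fraction(WORD_BITS) / bits_per_cycle


def refresh_interval_cycles(period_s=Fraction(32, 10000), rows=5120):
    """Note 19: every row is refreshed once per refresh period (3.2 ms)."""
    return Fraction(PCLK_HZ) * period_s / rows


if __name__ == "__main__":
    print(cdu_read_bits_per_cycle(4))          # bits/cycle at SF = 4, 1:1
    print(cycles_per_word(Fraction(4, 5)))     # note 15: 0.8 bits/cycle
    print(refresh_interval_cycles())           # note 19
```

Exact fractions are used so that values such as 32/36 at SF = 6 are not hidden by rounding.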
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20.7 DIU BUS TOPOLOGY 
20.7.1 Basic topology



Table 68. SoPEC DIU Requesters

Read | Write | Other
CPU | CPU | Refresh
CDU | SCB |
CFU | CDU |
LBD | SFU |
SFU | DWU |
TE(TD) | |
TE(TFS) | |
HCU | |
DNC | |
LLU | |
PCU | |


Table 68 shows the DIU requesters in SoPEC. There are 11 read requesters and 5 write requesters in
SoPEC as compared with 8 read requesters and 4 write requesters in PEC1. Refresh is an additional
requester.

In PEC1, the interface between the DIU and the DIU requesters had the following main features:

• separate control and address signals per DIU requester multiplexed in the DIU according to the
arbitration scheme,

• separate 64-bit write data bus for each DRAM write requester multiplexed in the DIU,

• common 64-bit read bus from the DIU with separate enables to each DIU read requester.

Timing closure for this bussing scheme was straightforward in PEC1. This suggests that a similar scheme
will also achieve timing closure in SoPEC. SoPEC has 5 more DRAM requesters but it will be in a 0.13
um process with more metal layers and SoPEC will run at approximately the same speed as PEC1.

Using 256-bit busses would match the data width of the embedded DRAM but such large busses may 
result in an increase in size of the DIU and the entire SoPEC chip. The SoPEC requestors would require 
double 256-bit wide buffers to match the 256-bit busses. These buffers, which must be implemented in 
flip-flops, are less area efficient than 8-deep 64-bit wide register arrays which can be used with 64-bit
busses. SoPEC will therefore use 64-bit data busses. Use of 256-bit busses would however simplify the DIU
implementation as local buffering of 256-bit DRAM data would not be required within the DIU. 

20.7.1.1 CPU DRAM access 

The CPU is the only DIU requestor for which access latency is critical. All DIU write requesters transfer 
write data to the DIU using separate point-to-point busses. The CPU will use the cpu_dataout[31:0] bus.
CPU reads will not be over the shared 64-bit read bus. Instead, CPU reads will use a separate 256-bit read 
bus. 

20.7.2 Making more efficient use of DRAM bandwidth

The embedded DRAM is 256-bits wide. The 4 cycles it takes to transfer the 256-bits over the 64-bit data 
busses of SoPEC means that effectively each access will be at least 4 cycles long. It takes only 3 cycles to 
actually do a 256-bit random DRAM access in the case of IBM DRAM. 
20.7.2.1 Common read bus



If we have a common read data bus, as in PEC1, then if we are doing back to back read accesses the next
DRAM read cannot start until the read data bus is free. So each DRAM read access can occur only every 4
cycles. This is shown in Figure 67 with the actual DRAM access taking 3 cycles leaving 1 unused cycle
per access.



[Timing diagram: pclk, diu_data[63:0], rreq(n+1) to rreq(n+3) and rack(n+1) to rack(n+3); accesses n, n+1 and
n+2 are each followed by one unused cycle.]

Figure 67. Shared read bus with 3 cycle random DRAM read accesses



20.7.2.2 Interleaving CPU and non-CPU read accesses

The CPU has a separate 256-bit read bus. All other read accesses are 256-bit accesses over a shared 64-bit
read bus. Interleaving CPU and non-CPU read accesses means the effective duration of an interleaved
access timeslot is the DRAM access time (3 cycles) rather than 4 cycles. Interleaving is achieved by
ordering the DIU arbitration slot allocation appropriately.
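
The effect of interleaving on timeslot duration can be sketched as follows. This is a simplified model under stated assumptions: the names `timeslot_cycles` and `rotation_cycles` are hypothetical, every access is either on the shared 64-bit read bus ("shared") or not (for example "cpu"), and the slot list is treated as rotating so the last slot precedes the first.

```python
DRAM_ACCESS_CYCLES = 3            # random 256-bit DRAM access time
BUS_TRANSFER_CYCLES = 256 // 64   # cycles to move 256 bits over a 64-bit bus


def timeslot_cycles(prev_on_shared_bus, cur_on_shared_bus):
    """A slot stretches to 4 cycles only when it must wait for the
    shared read bus to drain the previous access."""
    if prev_on_shared_bus and cur_on_shared_bus:
        return BUS_TRANSFER_CYCLES
    return DRAM_ACCESS_CYCLES


def rotation_cycles(slots):
    """Total cycles for one rotation of a slot list such as
    ["shared", "cpu", ...]."""
    total = 0
    for i, cur in enumerate(slots):
        prev = slots[i - 1]  # index -1 wraps around to the last slot
        total += timeslot_cycles(prev == "shared", cur == "shared")
    return total


if __name__ == "__main__":
    print(rotation_cycles(["shared"] * 64))
    print(rotation_cycles(["shared", "cpu"] * 32))
```

With 64 back-to-back shared-bus reads the rotation takes 256 cycles; alternating shared-bus and CPU reads brings every slot down to the 3 cycle DRAM access time.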

Figure 68 shows interleaved CPU and non-CPU read accesses. 
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Figure 68. Interleaving CPU and non-CPU read accesses 



20.7.2.3 Interleaving read and write accesses

Having separate write data busses means write accesses can be interleaved with each other and with read 
accesses. So now the effective duration of an interleaved access timeslot is the DRAM access time (3 
cycles) rather than 4 cycles. Interleaving is achieved by ordering the DIU arbitration slot allocation appro- 
priately. 

Figure 69 shows interleaved read and write accesses. Figure 70 shows interleaved write accesses. 



[Timing diagram: pclk, diu_data[63:0], and 256-bit buffered write data for SoPEC Unit n and SoPEC Unit m.]

Figure 69. Interleaving read and write accesses with 3 cycle random DRAM accesses



Write data still takes 4 cycles to transmit over 64-bit busses so 256-bit buffers are required in the DIU to
gather the write data from the requesters. 
Figure 70. Interleaving write accesses with 3 cycle random DRAM accesses



20.7.3 Bus widths summary



Table 69. SoPEC DIU Requesters Data Bus Width

Read | Bus access width | Write | Bus access width
CPU | 256 (separate) | CPU | 32 (OPEN ISSUE)
CDU | 64 (shared) | SCB | 64
CFU | 64 (shared) | CDU | 64
LBD | 64 (shared) | SFU | 64
SFU | 64 (shared) | DWU | 64
TE(TD) | 64 (shared) | |
TE(TFS) | 64 (shared) | |
HCU | 64 (shared) | |
DNC | 64 (shared) | |
LLU | 64 (shared) | |
PCU | 64 (shared) | |



20.7.4 Conclusions

Reads and writes can be interleaved with a separate 256-bit read bus for the CPU for minimum latency
DIU access. Interleaving can be performed by inserting write accesses or CPU accesses between shared 
read bus accesses. The interleaving is achieved by ordering the DIU arbitration slot allocation appropri- 
ately. 
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20.8 



SoPEC DRAM ADDRESSING SCHEME 



The embedded DRAM is composed of 256-bit words. However the CPU-subsystem may need to write 
individual bytes of DRAM. Therefore it was decided to make the DIU byte addressable. 22 bits are 
required to byte address 20 Mbit of DRAM. 

Most blocks read or write 256 bit words of DRAM. Therefore only the top 17 bits i.e. bits 21 to 5 are 
required to address 256-bit word aligned locations. 

The exceptions are 

• CDU which can write 64-bits so only the top 19 address bits i.e. bits 21-3 are required 

• CPU writes can be 8, 16 or 32-bits. The cpu_diu_wmask[1:0] pins indicate whether to write 8, 16 or 32
bits.

All DIU accesses must be within the same 256-bit aligned DRAM word. 
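
The address widths quoted above follow directly from the DRAM size and the access granularity. The sketch below is illustrative only; the helper names are hypothetical and 20 Mbit is assumed to mean 20 x 2^20 bits.

```python
DRAM_BITS = 20 * 2**20        # 20 Mbit of embedded DRAM (assumed binary Mbit)
DRAM_BYTES = DRAM_BITS // 8


def address_bits(locations):
    """Minimum number of address bits needed to index `locations` items."""
    return (locations - 1).bit_length()


def word_index(byte_addr):
    """256-bit word index: address bits 21 to 5."""
    return byte_addr >> 5


def cdu_index(byte_addr):
    """64-bit aligned CDU write index: address bits 21 to 3."""
    return byte_addr >> 3


def same_dram_word(addr_a, addr_b):
    """True if both byte addresses fall in the same 256-bit aligned word."""
    return word_index(addr_a) == word_index(addr_b)


if __name__ == "__main__":
    total = address_bits(DRAM_BYTES)
    print(total, total - 5, total - 3)
```

Dropping the low 5 bits leaves the 17 bits used for 256-bit aligned accesses, and dropping the low 3 bits leaves the 19 bits used by the CDU.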
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20.9 DIU Protocols 

The DIU protocols are 

• pipelined i.e. the following transaction is initiated while the previous transfer is in progress.

• split transaction i.e. the transaction is split into independent address and data transfers.



20.9.1 Read Protocol except CPU

The SoPEC read requestors, except for the CPU, perform single 256-bit read accesses with the read data
being transferred from the DIU in 4 consecutive cycles over a shared 64-bit read bus, diu_data[63:0]. The
read address <unit>_diu_radr[21:5] is 256-bit aligned.

The read protocol is:

• <unit>_diu_rreq is asserted along with a valid <unit>_diu_radr[21:5].

• The DIU acknowledges the request with diu_<unit>_rack. The request should be deasserted. The
minimum number of cycles between <unit>_diu_rreq being asserted and the DIU generating a
diu_<unit>_rack strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration -
see Section 20.13.6).

• The read data is returned on diu_data[63:0] and its validity is indicated by diu_<unit>_rvalid.

• When four diu_<unit>_rvalid pulses have been received then if there is a further request
<unit>_diu_rreq should be asserted again. diu_<unit>_rvalid will always be asserted by the DIU
for four consecutive cycles. The first diu_<unit>_rvalid pulse will occur 3 cycles after
diu_<unit>_rack (1 cycle to transfer the address to the DRAM, 2 cycles for the read data to be
returned from the DRAM).
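
The minimum timing implied by the read protocol can be laid out cycle by cycle. This is a minimal sketch, not the DIU implementation: `read_timeline` and its dictionary keys are hypothetical names, and cycle numbers are relative pclk counts.

```python
def read_timeline(req_cycle, arbitration_delay=2):
    """Return the cycles at which the protocol events occur for one
    256-bit read, given the cycle in which the request is asserted.

    arbitration_delay is the request-to-rack delay; the text gives a
    minimum of 2 cycles (register the request, perform arbitration).
    """
    if arbitration_delay < 2:
        raise ValueError("rack cannot follow rreq by fewer than 2 cycles")
    rack = req_cycle + arbitration_delay
    first_valid = rack + 3  # 1 cycle address transfer + 2 cycles DRAM read
    return {
        "rreq": req_cycle,
        "rack": rack,
        # four consecutive 64-bit beats on the shared read bus
        "rvalid": [first_valid + beat for beat in range(4)],
    }


if __name__ == "__main__":
    print(read_timeline(0))
```

In the best case the last 64-bit beat arrives 8 cycles after the request is first asserted; a longer wait for arbitration shifts every later event by the same amount.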



[Timing diagram: pclk, <unit>_diu_rreq, diu_<unit>_rack, <unit>_diu_radr[21:5], diu_<unit>_rvalid and
diu_data[63:0] carrying data beats 1 to 4.]

Figure 71. Read protocol for a SoPEC Unit making a single 256-bit access



20.9.2 Read Protocol for CPU 

The CPU performs single 256-bit read accesses with the read data being transferred from the DIU over a
dedicated 256-bit read bus for DRAM data, dram_cpu_data[255:0]. The read address cpu_adr[21:5] is
256-bit aligned.

The CPU DIU read protocol is:

• cpu_diu_rreq is asserted along with a valid cpu_adr[21:5].

• The DIU acknowledges the request with diu_cpu_rack. The request should be deasserted. The
minimum number of cycles between cpu_diu_rreq being asserted and the DIU generating a diu_cpu_rack
strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration - see Section
20.13.6).

• The read data is returned on dram_cpu_data[255:0] and its validity is indicated by diu_cpu_rvalid.

• When the diu_cpu_rvalid pulse has been received then if there is a further request cpu_diu_rreq should
be asserted again. The diu_cpu_rvalid pulse will occur 3 cycles after rack (1 cycle to transfer the
address to the DRAM, 2 cycles for the read data to be returned from the DRAM).



[Timing diagram: pclk, cpu_diu_rreq, diu_cpu_rack, cpu_adr[21:5], diu_cpu_rvalid and dram_cpu_data[255:0].]

Figure 72. Read protocol for a CPU making a single 256-bit access



20.9.3 Write Protocol except CPU and CDU

The SoPEC write requestors, except for the CPU and CDU, perform single 256-bit write accesses with the
write data being transferred to the DIU in 4 consecutive cycles, over dedicated point-to-point 64-bit write
data busses. The write address <unit>_diu_wadr[21:5] is 256-bit aligned.

The write protocol is:

• <unit>_diu_wreq is asserted along with a valid <unit>_diu_wadr[21:5].

• The DIU acknowledges the request with diu_<unit>_wack. The request should be deasserted. The
minimum number of cycles between <unit>_diu_wreq being asserted and the DIU generating a
diu_<unit>_wack strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration -
see Section 20.13.6).

• In the clock cycles following wack the SoPEC Unit outputs the <unit>_diu_data[63:0], asserting
<unit>_diu_wvalid. Write data should be output as soon as possible after receiving the wack.
Accessing registers, register arrays or SRAMs may incur different delays. The first <unit>_diu_wvalid pulse
can occur in the clock cycle after diu_<unit>_wack. In the case of register array or SRAM access, the
first <unit>_diu_wvalid pulse will occur 2 clock cycles after diu_<unit>_wack.

• Once all the write data has been output then if there is a further request <unit>_diu_wreq should be
asserted again.

A timeout mechanism will be implemented to ensure that the DIU will not lock-up if four
<unit>_diu_wvalid pulses are not provided.
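
The four-beat transfer of a 256-bit word over a 64-bit write bus can be modelled as below. This is a sketch under an explicit assumption: the text does not state the beat order, so least significant 64 bits first is chosen here purely for illustration, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
BEATS_PER_WORD = 4  # 256 bits / 64 bits per cycle


def to_beats(word256):
    """Split a 256-bit integer into four 64-bit beats.
    ASSUMPTION: least significant 64 bits are sent first."""
    if not 0 <= word256 < (1 << 256):
        raise ValueError("value does not fit in 256 bits")
    return [(word256 >> (64 * i)) & MASK64 for i in range(BEATS_PER_WORD)]


def from_beats(beats):
    """Reassemble the word as the DIU's 256-bit gather buffer would.
    A short transfer is rejected here; the real DIU uses a timeout."""
    if len(beats) != BEATS_PER_WORD:
        raise ValueError("expected exactly four wvalid beats")
    word = 0
    for i, beat in enumerate(beats):
        word |= (beat & MASK64) << (64 * i)
    return word


if __name__ == "__main__":
    sample = (0xAAAA << 192) | 0x1234
    print([hex(b) for b in to_beats(sample)])
```

The reassembly step corresponds to the 256-bit buffers the DIU needs in order to gather write data before issuing the DRAM write.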
[Timing diagram: <unit>_diu_wadr[21:5], diu_<unit>_wack, <unit>_diu_data[63:0] carrying data beats 1 to 4,
and <unit>_diu_wvalid.]

Figure 73. Write Protocol shown for a SoPEC Unit making a single 256-bit access



20.9.4 CPU Write Protocol

The CPU performs single writes which can be 8, 16 or 32-bits with the write data being transferred to the
DIU over the cpu_dataout[31:0] bus. The write address cpu_adr[21:0] is byte aligned.

The CPU write protocol is:

• cpu_diu_wreq is asserted along with a valid cpu_adr[21:0] and a write mask cpu_diu_wmask[1:0] to
indicate whether an 8, 16 or 32-bit access is required.

• The DIU acknowledges the request with diu_cpu_wack. The request should be deasserted. The
minimum number of cycles between cpu_diu_wreq being asserted and the DIU generating a
diu_cpu_wack strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration - see
Section 20.13.6).

• In the clock cycle following diu_cpu_wack the CPU outputs the cpu_dataout[31:0], asserting
cpu_diu_wvalid. Write data should be output as soon as possible after receiving the diu_cpu_wack.
The earliest the cpu_diu_wvalid pulse can occur is in the first clock cycle after diu_cpu_wack.

• Once the write data has been output then if there is a further request cpu_diu_wreq should be asserted
again.
[Timing diagram: pclk, cpu_diu_wreq, cpu_adr[21:0], cpu_diu_wmask[1:0], diu_cpu_wack, cpu_dataout[31:0]
and cpu_diu_wvalid.]

Figure 74. Write Protocol shown for a CPU making an 8, 16 or 32-bit access



20.9.5 CDU Write Protocol 

The CDU performs four 64-bit writes to 4 contiguous 256-bit DRAM addresses with the first address
specified by cdu_diu_wadr[21:3]. The write address cdu_diu_wadr[21:3] is 64-bit aligned.

The write protocol is:

• cdu_diu_wreq is asserted along with a valid cdu_diu_wadr[21:3].

• The DIU acknowledges the request with diu_cdu_wack. The request should be deasserted. The
minimum number of cycles between cdu_diu_wreq being asserted and the DIU generating a
diu_cdu_wack strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration - see
Section 20.13.6).

• In the clock cycles following wack the CDU outputs the cdu_diu_data[63:0], together with asserted
cdu_diu_wvalid. Write data should be output as soon as possible after receiving the wack. Accessing
registers, register arrays or SRAMs may incur different delays. The first cdu_diu_wvalid pulse can
occur in the clock cycle after diu_cdu_wack. In the case of register array or SRAM access, the first
cdu_diu_wvalid pulse will occur 2 clock cycles after diu_cdu_wack.

• Once all the write data has been output then if there is a further request cdu_diu_wreq should be
asserted again.

A timeout mechanism will be implemented to ensure that the DIU will not lock-up if four cdu_diu_wvalid
pulses are not provided.
[Timing diagram: pclk, cdu_diu_wreq, cdu_diu_wadr[21:3], diu_cdu_wack, cdu_diu_data[63:0] carrying data
beats 1 to 4, and cdu_diu_wvalid.]

Figure 75. Write Protocol shown for CDU making four contiguous 64-bit accesses
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20.10 DIU ARBITRATION MECHANISM 

The DIU will arbitrate access to the embedded DRAM. The arbitration scheme is outlined in the next
sections.

20.10.1 Timeslot based arbitration scheme

Table 67 summarised the bandwidth requirements of the SoPEC requestors to DRAM. If we allocate the
DIU requestors in terms of peak bandwidth then we require 36 bits/cycle (at SF = 6) and 42.5 bits/cycle (at
SF = 4) for all the requestors except the CPU.

A timeslot scheme is defined with 64 main timeslots. The number of used main timeslots is programmable
between 0 and 64.

Since DRAM read requestors, except for the CPU, are connected to the DIU via a 64-bit data bus each
256-bit DRAM access requires 4 pclk cycles to transfer the read data over the shared read bus. The
timeslot rotation period for 64 timeslots each of 4 pclk cycles is 256 pclk cycles or 1.6 µs, assuming pclk is
160 MHz. Each timeslot represents a 256-bit access every 256 pclk cycles or 1 bit/cycle. This is the
granularity of the majority of DIU requestors' bandwidth requirements in Table 67.

The SoPEC DIU requesters can be represented using 5 bits (Table on page 229). Using 64 timeslots 
means that to allocate each timeslot to a requester a total of 64 times 5 configuration registers is required 
for the 64 main timeslots. 
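
The rotation period, per-timeslot bandwidth and configuration storage quoted above can be reproduced with a few lines of arithmetic. This is an illustrative sketch with hypothetical function names; the default values are the ones given in the text.

```python
def rotation_cycles(timeslots=64, cycles_per_slot=4):
    """pclk cycles for one full rotation of the main timeslots."""
    return timeslots * cycles_per_slot


def rotation_period_us(timeslots=64, cycles_per_slot=4, pclk_mhz=160):
    """Rotation period in microseconds at the given pclk frequency."""
    return rotation_cycles(timeslots, cycles_per_slot) / pclk_mhz


def bits_per_cycle_per_slot(word_bits=256, timeslots=64, cycles_per_slot=4):
    """Bandwidth granted by one timeslot: one word per rotation."""
    return word_bits / rotation_cycles(timeslots, cycles_per_slot)


def config_bits(timeslots=64, bits_per_requester_id=5):
    """Storage needed to assign a 5-bit requester ID to every timeslot."""
    return timeslots * bits_per_requester_id


if __name__ == "__main__":
    print(rotation_cycles(), rotation_period_us())
    print(bits_per_cycle_per_slot(), config_bits())
```

Programming fewer than 64 main timeslots shortens the rotation and so raises the bandwidth represented by each remaining timeslot.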

Timeslot based arbitration works by having a pointer point to the current timeslot. When re-arbitration is
signaled the arbitration pointer will advance to the next timeslot. If the SoPEC Unit assigned to the current
timeslot is not requesting then the unused timeslot arbitration mechanism outlined in Section 20.10.4 is
used to select the arbitration winner.

The timeslot pointer advances when the DIU issues the next command to the DRAM. Each timeslot
therefore denotes a single access. The duration of the timeslot depends on the access.

If the SoPEC Unit pointed to by the current timeslot pointer is not requesting then the slot will be allocated 
according to the mechanism described in Section 20.10.5. 



[Diagram: the current timeslot pointer indicating timeslot n in the sequence of timeslots n-1, n, n+1.]

Figure 76. Timeslot based arbitration

20.10.2 Separate read and write arbitration windows 

For write accesses, except the CPU, 256-bits of write data are transferred from the SoPEC DIU write
requestors over 64-bit write busses in 4 clock cycles. This write data transfer latency means that write
accesses, except for CPU writes, must be arbitrated 4 cycles in advance. The [to be included figure and
explanation] shows why this is necessary.
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S3 



Since write arbitration must occur 4 cycles in advance, and the minimum duration of a timeslot is
3 cycles, the arbitration rules must be modified to initiate write accesses in advance accordingly. There is
a timeslot lookahead pointer, shown in Figure 77, two timeslots in advance of the current timeslot pointer.



[Diagram: the current timeslot pointer and the timeslot lookahead pointer positioned over the timeslot sequence.]

Figure 77. Timeslot based arbitration with separate read and write pointers
The following examples illustrate separate read and write timeslot arbitration. 



[Diagram: programmed timeslot order, timeslot arbitration order and actual timeslot order for a sequence of
read (R) and write (W) timeslots, with the write latency marked.]

Figure 78. Example (a), separate read and write arbitration

In Figure 78 writes are arbitrated two timeslots in advance. Reads are arbitrated in the same cycle. Writes
can be arbitrated in the same cycle as a read. During arbitration the command address of the arbitrated
SoPEC Unit is captured.

Other examples are shown in Figure 79, Figure 80 and Figure 81. The actual timeslot order is always the
same as the programmed timeslot order i.e. out of order accesses do not occur and data coherency is never
an issue.

Each write must always incur a latency of two timeslots. If the first write occurs in the first timeslot then
all following timeslots will incur a latency of two timeslots. This is shown in Figure 78 and Figure 79. If
the first write occurs in the second timeslot then all following timeslots will incur a latency of two
timeslots. This is shown in Figure 80.
[Diagrams: timeslot arbitration order and actual timeslot order for examples (b) and (c).]

Figure 79. Example (b), separate read and write arbitration

Figure 80. Example (c), separate read and write arbitration
[Diagram: programmed timeslot order, timeslot arbitration order and actual timeslot order for a sequence of
read (R) and write (W) timeslots, with the initial write latency marked.]

Figure 81. Example (d), separate read and write arbitration



Table 70 shows the 4 scenarios depending on whether the current timeslot and timeslot lookahead pointers 
point to read or write accesses. 

To be checked and updated: 

Table 70: Arbitration with separate windows for read and write accesses

Current timeslot | Timeslot lookahead | Actions
read | write | Initiate read transfer. Initiate write transfer.
read1 | read2 | Initiate read1 transfer.
write1 | write2 | Initiate write2 transfer.
write | read | No action.



If the current timeslot pointer points to a read access then this will be initiated immediately.

If the timeslot lookahead pointer points to a write access then this access is initiated immediately, or
immediately after the read access associated with the current timeslot pointer is initiated.

When a write access is initiated the DIU will capture the write address and will do the DRAM write two
timeslots in advance when the associated write data has been transferred to the DIU.

To be checked and updated: At initialisation, both pointers point to the first timeslot. The lookahead
pointer advances to the second timeslot and the third timeslot in successive clock cycles until it is two
timeslots ahead of the current timeslot pointer. Then both pointers advance in tandem. At each step, the
rules in Table 70 are obeyed. This leads to the behaviour shown in the examples of Figure 78 to Figure 81.
CPU write accesses are excepted from the lookahead mechanism.
Timing diagrams for these scenarios are shown in Section 20.13 Implementation.
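
The four scenarios of Table 70 reduce to two independent rules, which can be sketched as below. This is only an illustration of the table as it stands (the source marks it as still to be checked); the function name and action strings are hypothetical.

```python
def arbitration_actions(current, lookahead):
    """Return the transfers to initiate for one arbitration step.

    current   -- "read" or "write": access type at the current pointer
    lookahead -- "read" or "write": access type at the lookahead pointer
    """
    for kind in (current, lookahead):
        if kind not in ("read", "write"):
            raise ValueError("access type must be 'read' or 'write'")
    actions = []
    if current == "read":
        # reads are initiated when the current pointer reaches them
        actions.append("initiate read at current timeslot")
    if lookahead == "write":
        # writes are initiated two timeslots early to cover the
        # 4-cycle transfer of write data to the DIU
        actions.append("initiate write at lookahead timeslot")
    return actions


if __name__ == "__main__":
    for pair in [("read", "write"), ("read", "read"),
                 ("write", "write"), ("write", "read")]:
        print(pair, arbitration_actions(*pair))
```

A write under the current pointer needs no action because it was already initiated when the lookahead pointer passed over it.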
If the selected SoPEC Unit is not requesting then there will be separate read and write selection for unused
timeslots. This is described in Section 20.10.5.

20.10.3 Arbitration of CPU accesses 

The CPU can be allocated timeslots like any other DIU requestor. If CPU accesses are interleaved between 
the shared read bus accesses then the DIU timeslots will take 3 cycles as shown in Section 20.7.2.2. The 
timeslot rotation will be faster than 256 pclk cycles. 

What distinguishes the CPU from other SoPEC requestors, is that the CPU requires minimum latency 
DRAM access i.e. preferably the CPU should get the next available timeslot whenever it requests. 

The minimum CPU read access latency is estimated in Table 71. This is the time between the CPU making
a request to the DIU and receiving the read data back from the DIU. This ignores any latency associated 
with the CPU's caching mechanism. 

Table 71. Estimated CPU read access latency ignoring caching

CPU read access latency | Duration
register the CPU read request | 1 cycle
complete the arbitration of the request | 1 cycle
transfer the read address to the DRAM | 1 cycle
DRAM read latency | 2 cycles
register the read data | 1 cycle
TOTAL | 6 cycles



If the CPU, as is likely, requests DRAM access again immediately after receiving data from the DIU then 
the CPU can access every second timeslot. This assumes that interleaving is employed so that timeslots 
last 3 cycles. If the CPU access latency increases to 7 cycles then the CPU will only be able to access every 
third timeslot. 
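
The relationship between CPU access latency and how often the CPU can use a timeslot is a ceiling division. The following sketch uses a hypothetical function name and assumes interleaved 3 cycle timeslots as in the paragraph above.

```python
def cpu_timeslot_stride(latency_cycles, slot_cycles=3):
    """Smallest number of timeslots that covers one CPU access, i.e. the
    CPU can use at most every Nth timeslot when it re-requests as soon
    as its read data arrives."""
    if latency_cycles <= 0 or slot_cycles <= 0:
        raise ValueError("cycle counts must be positive")
    return -(-latency_cycles // slot_cycles)  # integer ceiling division


if __name__ == "__main__":
    print(cpu_timeslot_stride(6))  # Table 71 total latency
    print(cpu_timeslot_stride(7))  # one extra cycle of latency
```

A single extra cycle of latency therefore costs the CPU a whole timeslot, which is why the 6 cycle estimate matters.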

If a cache hit occurs the CPU does not require DRAM access. For its next DIU access it will have to wait 
for its next assigned DIU slot. Cache hits therefore will reduce the number of DRAM accesses but not 
speed up any of those accesses. 

To avoid the CPU having to wait for its next timeslot it is desirable to have a mechanism for ensuring that 
the CPU always gets the next available timeslot without incurring any latency on the non-CPU timeslots. 
This can be done by defining each timeslot as consisting of a CPU access preceding a non-CPU access. 
Each timeslot will last 6 cycles i.e. a CPU access of 3 cycles and a non-CPU access of 3 cycles. This is 
exactly the interleaving behaviour outlined in Section 20.7.2.2. If the CPU does not require an access, the 
timeslot will take 3 or 4 cycles and the timeslot rotation will go faster. A summary is given in Table 72.



Table 72. Timeslot access times.

Access | Duration | Explanation
CPU access + non-CPU access | 3 + 3 = 6 cycles | Interleaved access
non-CPU access | 4 cycles | Access and preceding access both to shared read bus
non-CPU access | 3 cycles | Access and preceding access not both to shared read bus
CDU write access | 3 + 1 + 1 + 1 = 6 cycles | Page mode select signal is clocked at 320 MHz


CDU write accesses require 6 cycles. CDU write accesses preceded by a CPU access require 9 cycles.
CDU timeslots therefore take longer than all other DIU requestors' timeslots.

With a 256 cycle rotation there can be 42 accesses of 6 cycles. This is just enough timeslots for SF = 4 
operation, ignoring implementation pipeline latencies. 

For low scale factor applications, it is desirable to have more timeslots available in the same 256 cycle
rotation. So two counters of 4-bits each are defined allowing the CPU to get a maximum of cpu_timeslots
in total_timeslots. A timeslot counter starts at total_timeslots and decrements every timeslot, while another
counter starts at cpu_timeslots and decrements every timeslot in which the CPU uses its access. When the
CPU timeslot counter goes to zero before total_timeslots no further CPU accesses are allowed. When the
total_timeslots counter reaches zero both counters are reset to their respective initial values.
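
The two-counter scheme can be modelled in a few lines. This is a behavioural sketch only, with a hypothetical class name; it assumes both counters hold 4-bit values and that total_timeslots is at least 1.

```python
class CpuPreAccessCounters:
    """Limits the CPU to cpu_timeslots pre-accesses in every
    total_timeslots timeslots, as described in the text."""

    def __init__(self, cpu_timeslots, total_timeslots):
        if not 0 <= cpu_timeslots <= 15 or not 1 <= total_timeslots <= 15:
            raise ValueError("counters are 4 bits wide; total must be >= 1")
        self.cpu_timeslots = cpu_timeslots
        self.total_timeslots = total_timeslots
        self._reload()

    def _reload(self):
        self.cpu_left = self.cpu_timeslots
        self.total_left = self.total_timeslots

    def next_timeslot(self, cpu_requesting):
        """Advance one timeslot; return True if a CPU pre-access is granted."""
        granted = cpu_requesting and self.cpu_left > 0
        if granted:
            self.cpu_left -= 1   # decrements only when the CPU uses its access
        self.total_left -= 1     # decrements every timeslot
        if self.total_left == 0:
            self._reload()       # both counters return to their initial values
        return granted


if __name__ == "__main__":
    counters = CpuPreAccessCounters(cpu_timeslots=2, total_timeslots=4)
    print([counters.next_timeslot(True) for _ in range(8)])
```

Setting cpu_timeslots to zero disables CPU pre-accesses entirely, which matches the mode in which the CPU is allocated ordinary timeslots.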

When cpu_timeslots is set to zero then no accesses will be preceded by CPU accesses. The CPU can be
allocated timeslots like any other DIU requestor.

If CPU accesses are interleaved between the shared read bus accesses then the DIU timeslots will take 3
cycles as shown in Section 20.7.2.2. Otherwise the timeslots will take 4 cycles each and the rotation will
take 256 cycles.

The various modes of operation are summarised in Table 73 with a nominal rotation period of 256 cycles. 



Table 73. CPU timeslot allocation modes with nominal rotation period of 256 cycles

Access type | Nominal timeslot duration | Number of timeslots | Notes
----------------------------------------------------------------------------------------------------------------
CPU Pre-access, i.e. cpu_timeslots = total_timeslots | 6 cycles | 42 timeslots | Each access is CPU + non-CPU. If CPU does not use a timeslot then rotation is faster.
Fractional CPU Pre-access, i.e. cpu_timeslots < total_timeslots | 4 or 6 cycles | 42-64 timeslots | Each CPU + non-CPU access requires a 6 cycle timeslot. Individual non-CPU timeslots take 4 cycles if current access and preceding access are both to shared read bus. Individual non-CPU timeslots take 3 cycles if current access and preceding access are not both to shared read bus.
Interleaved, i.e. cpu_timeslots = 0 | 4 cycles | 64 timeslots | Timeslot rotation is faster by 1 cycle for each CPU, write access or interleaved read access.



20.10.4 Sub-timeslots 

Looking at the bandwidth requirements of the DIU requesters in Table 67, most DIU requesters require 
bandwidths of 1 bit/cycle or multiples thereof. However, some of the requesters require much lower 
bandwidth. This suggests that some sub-timeslots of lower granularity than a nominal 1 bit/cycle should be 
defined. 

There will be 2 sub-timeslots of 4 and 8 slots each. The bandwidth associated with each individual sub-slot 
is nominally 0.25 and 0.125 bits/cycle respectively, assuming each slot lasts 4 cycles. Sub-timeslots can be 
allocated to any number of main timeslots so that any multiple of the individual sub-timeslot bandwidth 
can be obtained. 

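Table 74. Sub-timeslot definition

Name         | Number of sub-slots | Bandwidth per sub-slot
-----------------------------------------------------------
Sub4timeslot | 4                   | 0.25 bits/cycle
Sub8timeslot | 8                   | 0.125 bits/cycle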



Each sub-slot pointer gets advanced each time it is accessed, regardless of whether its slot is used or not. 
Sub-timeslots are similar in all other ways to main timeslots, i.e. 

• they can have preceding CPU accesses in a similar manner. 

• unused slots are decided by the same unused timeslot allocation mechanism (Section 20.10.5). 

• a timeslot lookahead pointer is used to select writes (except for CPU writes) early to compensate for 
write data transfer latency. 



[Diagram: main timeslots m, n, n+1, n+2, ..., p with the current timeslot pointer; sub4timeslot and sub8timeslot, each with its own pointer]



Figure 82. Example sub-timeslot allocation 

An example sub-timeslot allocation is shown in Figure 82. 

Every time main timeslots m and n are accessed, the SoPEC unit pointed to by the pointer in sub4timeslot 
will win arbitration and the sub4timeslot pointer will advance. Similarly, every time main timeslots n+2 
and p are accessed, the SoPEC unit pointed to by the pointer in sub8timeslot will win arbitration and the 
sub8timeslot pointer will advance. 
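
The pointer behaviour in this example can be sketched as a small Python model. This is an illustration under assumptions, not the arbiter itself: the class and function names are hypothetical, the unit names "A" to "D" and "X" are placeholders, and re-allocation of unused slots is not modelled.

```python
class SubTimeslot:
    """A ring of sub-slots that shares the main timeslots mapped onto it."""

    def __init__(self, units):
        self.units = list(units)  # one requester per sub-slot
        self.pointer = 0

    def access(self):
        """Return the unit in the current sub-slot and advance the pointer."""
        unit = self.units[self.pointer]
        # The pointer advances on every access, whether or not the slot is used.
        self.pointer = (self.pointer + 1) % len(self.units)
        return unit


def run_rotation(main_timeslots, sub_timeslots):
    """Walk one rotation of main timeslots and list the arbitration winners.

    Entries of main_timeslots are either a requester name or the name of a
    sub-timeslot present in the sub_timeslots dictionary.
    """
    winners = []
    for entry in main_timeslots:
        if entry in sub_timeslots:
            winners.append(sub_timeslots[entry].access())
        else:
            winners.append(entry)
    return winners


if __name__ == "__main__":
    subs = {"sub4timeslot": SubTimeslot(["A", "B", "C", "D"])}
    rotation = ["sub4timeslot", "sub4timeslot", "X"]
    print(run_rotation(rotation, subs))  # first rotation
    print(run_rotation(rotation, subs))  # second rotation
```

Mapping two main timeslots onto a four-entry sub-timeslot means each of the four units is served once every two rotations, which is how a fraction of a main timeslot's bandwidth is obtained.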

20.10.5 Allocating unused timeslots 

Unused slots are re-allocated on a two-level round-robin basis. This is best-effort traffic. 
Each SoPEC requester has two associated bits: RoundRobinLevel indicates whether it is in level 1 or level 
2 round-robin, and RoundRobinEnable indicates whether it is enabled or not in the selected round-robin. 

Table 75. Round-robin selection

RoundRobinLevel | RoundRobinEnable | Selection
------------------------------------------------
0               | 0                | Not enabled
0               | 1                | Level 1
1               | 0                | Not enabled
1               | 1                | Level 2



Separate read and write round-robin trees are needed, one for read accesses and one for write accesses. 

CDU write accesses cannot be included in the round-robin allocation for write as CDU accesses take 6 
cycles. The write accesses which the CDU write could otherwise replace require only 3 or 4 cycles. 

Round-robin allocations do not have CPU pre-accesses. 

A pointer points to the current allocated unit in each of the round-robin levels. If the unit pointed to in the 
level 1 round-robin is requesting then this unit wins the arbitration and the pointer is advanced. If the unit 
pointed to in the level 1 round-robin is not requesting then the next units in the level 1 round-robin are 
examined. When a requesting unit is found this unit wins the arbitration and the pointer advances to the 
next unit. If no unit is requesting then the pointer does not advance and the second level of round-robin is 
examined in the same way as the first level of the round-robin. 
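
The two-level search described above can be illustrated with a short Python sketch. It is a behavioural model under assumptions: the names RoundRobinLevel and allocate_unused_slot are hypothetical, and "the pointer advances to the next unit" is read here as moving to the unit after the winner.

```python
class RoundRobinLevel:
    """One round-robin level: an ordered list of units and a pointer."""

    def __init__(self, units):
        self.units = list(units)
        self.pointer = 0

    def arbitrate(self, requesting):
        """Return the first requesting unit at or after the pointer, or None."""
        count = len(self.units)
        for offset in range(count):
            index = (self.pointer + offset) % count
            if self.units[index] in requesting:
                # Assumed interpretation: pointer moves to the unit after the winner.
                self.pointer = (index + 1) % count
                return self.units[index]
        return None  # nobody requesting: pointer is left unchanged


def allocate_unused_slot(level1, level2, requesting):
    """Try level 1 first; fall back to level 2 only if level 1 has no requester."""
    winner = level1.arbitrate(requesting)
    if winner is None:
        winner = level2.arbitrate(requesting)
    return winner


if __name__ == "__main__":
    level1 = RoundRobinLevel(["PCU", "DNC"])
    level2 = RoundRobinLevel(["CFU", "LBD", "TFU"])
    print(allocate_unused_slot(level1, level2, {"DNC", "LBD"}))
    print(allocate_unused_slot(level1, level2, {"LBD", "TFU"}))
```

Placing PCU and DNC in level 1, as suggested in Section 20.11.2, means they are always considered before any level 2 unit when a slot goes unused. The unit lists in the usage example are illustrative only.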



Table 76. Write round-robin registers bit order

Name   | Bit index
------------------
CPU(W) | 0
SCB    | 1
SFU(W) | 2
DWU    | 3



20.10.6 Background refresh controller 

A background refresh controller should be implemented that will issue a refresh and pause the timeslot 
rotator in case data is about to be lost. This scenario will only occur in the situation that insufficient 
timeslots were allocated for refresh. 

20.11 Guidelines for programming the DIU 

Some guidelines for programming the DIU arbitration scheme are given in this section together with an 
example. 

20.11.1 Implementation pipeline latencies 

The number of allocated timeslots for each requester needs to take into account implementation pipeline 
latencies. The number of timeslots is made programmable. This means 1 or 2 timeslots can be removed to 
allow for implementation latency. Each timeslot will allow for 6 cycles of implementation latency in CPU 
Pre-access mode and 3 cycles otherwise, for each single timeslot allocation in a rotation. If units are 
allocated more than 1 timeslot in a rotation then the gap between slots may need to be reduced additionally to 
allow for implementation latency. 




20.11.2 Ensuring sufficient DNC and PCU access 

PCU command reads from DRAM are exceptional events and should complete in as short a time as 
possible. Similarly, we must ensure there is sufficient free bandwidth for DNC accesses, e.g. when clusters of 
dead nozzles occur. In Table 67 DNC is allocated 3 times average bandwidth. PCU and DNC can also be 
allocated to the level 1 round-robin allocation for unused timeslots so that unused timeslot bandwidth is 
available to them. 



20.11.3 Basing timeslot allocation on peak bandwidths 

Since the embedded DRAM provides sufficient bandwidth to use 1:1 compression rates for the CDU and 
LBD, it is possible to simplify the main timeslot and sub-timeslot allocation by basing the allocation on 
peak bandwidths. The only variable in determining timeslot allocations then becomes the scale factor. 
If slot allocation is based on peak bandwidth requirements then DRAM access will be guaranteed to all 
SoPEC requesters. If we do not allocate slots for peak bandwidth requirements then we can also allow for 
the peaks deterministically by adding some cycles to the print line time. 

20.11.4 Adjacent timeslot limitations 

All DIU requesters have state-machines which request and transfer the read or write data before requesting 
again. The time to perform this operation is greater than the time between adjacent timeslots. Therefore 
adjacent timeslots should not be assigned to a particular DIU requester because the requester will not be 
able to make use of all these slots. 

20.11.5 Line margin 

The SFU must output 1 bit/cycle to the HCU. Since HCUNumDots may not be a multiple of 256 bits, the 
last 256-bit DRAM word on the line can contain extra zeros. In this case, the SFU may not be able to 
provide 1 bit/cycle to the HCU. This could lead to a stall by the SFU. This stall could then propagate if the 
margins being used by the HCU are not sufficient to hide it. The maximum stall can be estimated by the 
calculation: DRAM service period - X scale factor * dots used from last DRAM read for HCU line. 
Similarly, if the line length is not a multiple of 256 bits then, e.g., the LLU could read data from DRAM 
which contains padded zeros. This could lead to a stall. This stall could then propagate if the page margins 
cannot hide it. 

A single addition of 256 cycles to the line length will suffice for all DIU requesters to mask these stalls. 
20.11.6 Example DIU programming 

A full example to be worked out. 

Program the MainTimeslot and Sub<n>Timeslot configuration registers (Table 82) for peak required bandwidths 
of SoPEC units according to the scale factor used for the document. 

Program unused slots to use the round-robin allocation to share unused slots between all DIU requesters. 

20.12 CPU DRAM ACCESS PERFORMANCE 

This section does not yet reflect any implementation pipeline latencies. 

The CPU's share of the timeslots can be specified in terms of guaranteed bandwidth and average band- 
width allocations. 

The CPU's access rate to memory depends on 

• the CPU read access latency, i.e. the time between the CPU making a request to the DIU and receiving 
the read data back from the DIU. 

• how often it can get access to DIU timeslots. 

Table 71 estimated the CPU read latency, ignoring caching, as 6 cycles. 

How often the CPU can get access to DIU timeslots depends on the access type. 



Table 77. CPU DRAM access performance

Access type | Nominal timeslot duration | CPU DRAM access rate | Notes
----------------------------------------------------------------------------------------------------------------
CPU Pre-access | 6 cycles | Lower bound (guaranteed bandwidth) is 160 MHz / 6 = 26.67 MHz | CPU can access every timeslot.
Fractional CPU Pre-access | 6 cycles | Lower bound (guaranteed bandwidth) is (160 MHz * N / P) | CPU accesses precede a fraction N of timeslots where N = C/T. C = cpu_timeslots, T = total_timeslots, P = (6*C + 4*(T-C))/T
Interleaved | 4 cycles | See Section 20.12.1 | At SF = 6, 28 timeslots available for CPU. At SF = 4, 21 timeslots available for CPU.



For CPU Pre-access and Fractional CPU Pre-access modes, average and guaranteed CPU bandwidth are 
equivalent since the CPU is limited to a certain fraction of timeslots. 
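
The lower-bound expression 160 MHz * N / P from Table 77 can be evaluated directly. The Python sketch below is a worked example only: the function name and argument names are chosen for illustration, and it uses the table's definitions N = C/T and P = (6*C + 4*(T-C))/T with a 160 MHz pclk.

```python
PCLK_MHZ = 160.0


def guaranteed_cpu_rate_mhz(cpu_timeslots, total_timeslots, pclk_mhz=PCLK_MHZ):
    """Guaranteed CPU access rate in MHz for (Fractional) CPU Pre-access mode."""
    c, t = cpu_timeslots, total_timeslots
    if not 0 < c <= t:
        raise ValueError("expected 0 < cpu_timeslots <= total_timeslots")
    n = c / t                          # fraction of timeslots with a CPU pre-access
    p = (6 * c + 4 * (t - c)) / t      # average timeslot duration in cycles
    return pclk_mhz * n / p


if __name__ == "__main__":
    # Full CPU Pre-access: every timeslot is 6 cycles long.
    print(round(guaranteed_cpu_rate_mhz(4, 4), 2))
    # Fractional CPU Pre-access: one timeslot in four carries a CPU access.
    print(round(guaranteed_cpu_rate_mhz(1, 4), 2))
```

When C equals T the expression reduces to 160 MHz / 6, the CPU Pre-access row of the table.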

If the CPU runs out of its instruction cache then instruction fetch performance is only limited by the on- 
chip bus protocol. With a 2 cycle bus protocol (address cycle + data cycle) the performance would be 80 
MHz. 



20.12.1 CPU DRAM access performance with interleaved access mode 

Table 78 shows the guaranteed periodic CPU access with 4 cycle DRAM access and pclk = 160 MHz. 

Table 78. Guaranteed periodic CPU access with 4 cycle timeslots and pclk = 160 MHz

Timeslots left for CPU    | 26.25     | 21.5
Maximum wait for timeslot | 12 cycles | 12 cycles
CPU rate                  | 13.3 MHz  | 13.3 MHz



Since timeslots are integral multiples of 4 cycles, the maximum wait for a timeslot, and hence the 
minimum CPU rate, must reflect this. 

Table 79 shows the average CPU access with 4 cycle DRAM access and pclk = 160 MHz. This will be a 
bursty access. 

Table 79. Average bursty CPU access with 4 cycle DRAM access and pclk = 160 MHz

Timeslots left for CPU    | 34.95    | 30.8
Maximum wait for timeslot | 8 cycles | 12 cycles
CPU rate                  | 20 MHz   | 13.3 MHz



Interleaving of CPU and write accesses with shared read bus accesses will mean some of the timeslots will 
take 3 cycles rather than 4 cycles. This will mean that CPU slots will be available more frequently and 
higher CPU performance is attainable. 
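
The maximum-wait and CPU-rate figures in Tables 78 and 79 appear consistent with a simple calculation: divide the nominal 256 cycle rotation by the number of timeslots left for the CPU, round the result up to a whole number of 4 cycle timeslots, and divide pclk by that wait. The Python sketch below makes this explicit; the formula is inferred from the tabulated values rather than stated in the text, and the function name is illustrative.

```python
import math


def cpu_wait_and_rate(timeslots_left, rotation_cycles=256, slot_cycles=4, pclk_mhz=160.0):
    """Return (maximum wait in cycles, CPU rate in MHz) for a given slot budget."""
    average_gap = rotation_cycles / timeslots_left
    # Timeslots are integral multiples of slot_cycles, so round the gap up.
    max_wait = math.ceil(average_gap / slot_cycles) * slot_cycles
    return max_wait, pclk_mhz / max_wait


if __name__ == "__main__":
    for slots in (26.25, 21.5, 34.95, 30.8):
        wait, rate = cpu_wait_and_rate(slots)
        print(slots, wait, round(rate, 1))
```

Under these assumptions the four timeslot budgets in the two tables give waits of 12, 12, 8 and 12 cycles respectively.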
20.13 Implementation 

The DRAM Interface Unit (DIU) is partitioned into 2 logical blocks to facilitate design and verification. 

a. The DRAM Access Unit (DAU) which interfaces to the SoPEC DIU requesters. 

b. The DRAM Controller Unit (DCU) which accesses the embedded DRAM. 



[Block diagram: SoPEC units connect to the DRAM Access Unit (DAU); the DAU connects to the DRAM Controller Unit (DCU); the DCU connects to the eDRAM]







Figure 83. DIU Partition 

The basic principle in design of the DIU is to ensure that the eDRAM is accessed at its maximum rate 
while keeping the circuit latency for each access as low as possible. 

The DCU is designed to interface with single bank 20 Mbit IBM Cu-11 embedded DRAM performing 
random accesses every 3 cycles. Page mode write accesses, associated with the CDU, are also supported. 

The DAU is designed to support interleaved accesses allowing the DRAM to be accessed every 3 cycles 
where back-to-back accesses do not occur over the shared 64-bit read data bus. 

20.13.1 Definition of DCU IO 



Table 80. DCU interface

Port name               | Pins | I/O | Description
----------------------------------------------------------------------------------------------------------------
Clocks and Resets
pclk                    | 1    | In  | SoPEC functional clock
prst_n                  | 1    | In  | Active-low, synchronous reset in pclk domain
Inputs from DAU
dau_dcu_cmdavail        | 1    | In  | Signal indicating a DAU command is available, i.e. dau_cmd_adr, dau_cmd_rwn and dau_cmd_refresh are valid.
dau_dcu_cmdadr[21:5]    | 17   | In  | Signal indicating the address for the DRAM access. This is a 256-bit aligned DRAM address.
dau_dcu_cmdrwn          | 1    | In  | Signal indicating the direction for the DRAM access (1=read, 0=write).
dau_dcu_cmdrefresh      | 1    | In  | Signal indicating that a refresh command is to be issued. If asserted dau_cmd_adr and dau_cmd_rwn will be ignored.
dau_dcu_wdata           | 256  | In  | 256-bit write data to DCU
dau_dcu_wmask           | 256  | In  | 256-bit write data mask to DCU
dau_dcu_wvalid          | 1    | In  | Signal indicating valid write data and write mask.
Outputs to DAU
dcu_dau_cmdaccept       | 1    | Out | Signal indicating that the DCU has accepted a valid command from the DAU.
dcu_dau_refreshcomplete | 1    | Out | Signal indicating that the DCU has completed a refresh.
dcu_dau_rdata           | 256  | Out | 256-bit read data from DCU.
dcu_dau_rvalid          | 1    | Out | Signal indicating valid read data on dcu_rdata.
Outputs to DRAM
Inputs from DRAM

20.13.2 Definition of DAU IO 



Table 81. DAU interface

Port name              | Pins | I/O | Description
----------------------------------------------------------------------------------------------------------------
Clocks and Resets
pclk                   | 1    | In  | SoPEC functional clock
prst_n                 | 1    | In  | Active-low, synchronous reset in pclk domain
CPU Interface
cpu_adr[9:2]           | 8    | In  | CPU address bus. 8 bits are required to decode the address space for this block.
cpu_dataout[31:0]      | 32   | In  | Shared write data bus from the CPU
diu_cpu_data[31:0]     | 32   | Out | Configuration, status and debug read data bus to the CPU
cpu_rwn                | 1    | In  | Common read/not-write signal from the CPU
cpu_acode[1:0]         | 2    | In  | CPU access code signals. cpu_acode[0] - Program (0) / Data (1) access. cpu_acode[1] - User (0) / Supervisor (1) access. The DAU will only allow supervisor mode accesses to data space.
cpu_diu_sel            | 1    | In  | Block select from the CPU. When cpu_diu_sel is high both cpu_adr and cpu_dataout are valid.
diu_cpu_rdy            | 1    | Out | Ready signal to the CPU. When diu_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the block and for a read cycle this means the data on diu_cpu_data is valid.
diu_cpu_berr           | 1    | Out | Bus error signal to the CPU indicating an invalid access.
DIU Read Interface to SoPEC Units
<unit>_diu_rreq        | 1    | In  | SoPEC unit requests DRAM read. A read request must be accompanied by a valid read address.
<unit>_diu_radr[21:5]  | 17   | In  | Read address to DIU. 17 bits wide (256-bit aligned word).
diu_<unit>_rack        | 1    | Out | Acknowledge from DIU that read request has been accepted and new read address can be placed on <unit>_diu_radr.
diu_data[63:0]         | 64   | Out | Data from DIU to SoPEC Units except CPU. First 64-bits is bits 63:0 of 256 bit word. Second 64-bits is bits 127:64 of 256 bit word. Third 64-bits is bits 191:128 of 256 bit word. Fourth 64-bits is bits 255:192 of 256 bit word.
dram_cpu_data[255:0]   | 256  | Out | 256-bit data from DRAM to CPU.
diu_<unit>_rvalid      | 1    | Out | Signal from DIU telling SoPEC Unit that valid read data is on the diu_data bus.
DIU Write Interface to SoPEC Units
<unit>_diu_wreq        | 1    | In  | SoPEC unit requests DRAM write. A write request must be accompanied by a valid write address.
<unit>_diu_wadr[21:5]  | 17   | In  | Write address to DIU except CPU, CDU. 17 bits wide (256-bit aligned word).
cpu_adr[21:0]          | 22   | In  | CPU write address to DIU. 22 bits wide (8-bit aligned word). Addresses cannot cross a 256-bit word DRAM boundary.
Table 98. CDU registers 



22^.3 





put of the fl« 8x8 Wock Of the test data 
12 - TSOB outptrtof CS1650. (ndicales the Hrst out- 
put bjrte of each 8x8 block of test data 

I!^^ ^ »»" - "'sP'ays OCT 

coefficients or quantized coefficients depending on 
value of JpgOecTType. f^namg on 



^ ^l;^«>«9-«<W/rrf set indicates that the JPEG 
^^^^J" "^""^ of jdk as the output jp£ 
halftkxkdouble-buffefsoftheCOUateMn 

Bitel 9-16 - fffe.ewrtenis (rfo at faipiit of JPEG 
decoder core) 

CS6150 (see Table 100 tbr description of bit.1 



Typical operation 

The CDU should only be started after the CFU has been started. 

yl^«e.. Use«ti;n8CtXS^scTwJ^;^^ BuffEndBlockAdr Nuniuf. 

for the band has finished bei;.;:^r rCT.r^'^ band. When the compressed contone data 
indicating that the memoiy ^>^Jid^^ £^^^ "^"^^ will be sent to the PCU and CPU 
band of contone data. ^ ^ oand is now fiee. Processing can now start on the next 

for restarting the CDU betS^f * * toAfe«ff«„«E£>u,6fe. There a« 4 mechanise, 

nmt band, end tuk7S^Zt^L^nuT^'^'' °^ 
^y..thea.Ustartsp^Xe.S:tbldr^^ 

to^^i^S ^eS-^S::^'^'*'^*^^^*^ to execute connnands ftom 
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tag .gmi. An tatonipt te «», » thTcpu hi^^^' "PW ill" u<i .w«i . res., before stuttoj deed- 



22.5.4 Read control unit 



accesses. 

accesses to DRAM is described in sccion M^Ton it!?M5f p '^f'^' ^^"^ 
by means of the state machine desmb«l K^Tof "^""^ implemented 

it whether to attempt to read a band of c^Sr«^S S.S^eX'^ 'rif" » ^ ^'^^ *^ 
does nothing. When DoneBand is clear^ ,1^- ^ DoneBand is set, the state machine 

up to 256^ts at a time wlSilf S^Ll^^ ^^^a^S^StlT^^ 

knowledge about numbers of blocks or n^nhl!^^ , machine has no 

by consecutive reads fto^ ii^^TS^^S^ ' " ""^^ «FO full 

atleastat^^pea^ORAMreadband^^S^^:-^^ 

'r^-rs; rs»:SL^'r "^v^"^. ^ ^ ^^^^^ access, u is 

dtu_cdu^id being asW «t^o^JJ^ It «»y 
end_of_bandston:: '^'^-^<»^^-'^ « compared to both end_^rce^ and 

..^Ije remaining64.it valuesinthit^^^^ 

' ^J-^elJt'„pi::rto^Sn^^^ «.0<~n:._«^. then 

whether c««.^o««.?a*.als2^etS^^:^T*a^^ T ?' "^''"^^^^ 

FIFO is 0. ^ *oe c/tt/_o/lfeartrf control signd 

«rr^OK/iw_adr is output to the DIU as a/K_rf/«_.rad>: 

A count is kept of the number of 64-bit values in the ftfo wi,-.. ^ j 

0. data is written to the HFO by asserti^ SbwflS ''"'-'^"-rvalid is 1 and ignor^_data is 

incremented. ^ assertmg fjfoffrr, »aAJ^o_contents[3:0] mAfifo_y^_adrp:0] are bodi 

data fiom the FIFO. Note it is ^^y>^^y>tT^f^^T:^^''-^ " *° 
ister to 1. In this case data is sent LctirfSmSriiroSt^ftT'^'! by setting the i?ypa«;/^^ ^. 
decoder is riot stalled Qpg_core statt J^SlTlll^V T!^^'^^ double-buffer. While the JPEG 
a byte of data is ccnsS by JplolSid^ ^^^'^ ^^^^^^^ --^JPZJn^trb are both I. 
next byte. The read address is byte J^T^^^^V^r "^-'^^ " incremented to select the 
"PP" 3 are mput as the read address for the FIFO 
[State diagram: states idle, req, ack and read, with outputs cdu_diu_rreq and ignore_data]

Figure 101. State machine to read compressed contone data 



22.5.5 Compressed contone FIFO 



ten to the t^^T^ S^DW e^^^^ ^^^^ ''T'-^^^^ 

When enrf_o/ band =. 1 dtirine an innT^^i^^ r '"'^ ^ " *e last tnmsfer. 

sioa of the same. * roiui^^ew^eft* legistcr is ateo copied to an image ver- 
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«» mu, be ».„ uJ.t Siit ^ ?"b^ to ST"' to each band .r^^! 

22.5.6 CS6150 JPEG decoder 

the CS6150 JPEG decodJSl c^'^ MH. (>UnphioZve staS i2 

which a gated ve«ion of flic systemTock pcik. b^. T techno ogy). The co« is clocked byyctt 
JPEG decoder on a single coJpixXbyScdLS^^^S off T'"'^ * '"echamsa, for stalling fte 
the PixOutEnab input to the JPEG d«LS^ 

block boundary anSis^^d^f^^"?;^^^ ^fLlE"^^^. °' ^ ^^^O 

instead tied high. oorc^,. inus gating of the clock is employed and PUOutEnitb is 

Lfa« (DNL) n^Hcr „ d« ^.^SSy^^^.J'ZS^ t " '*"<>«' 

l«eth aa ttua b a m<KlffliMto „ tt^^* °' »^ " ""WS of »»» am 

11»Mowi,«a„ba«.-.a,d««»ibca»™.„by„Ucta„CS6I50i.».^,.a.b.™i&ri^^^ 
JPEG decoder parameter bus 

mines which internal parameters are diZlZZ on thf ^ ^ t^" ""P"* i-^PgDecPlipe} detcr- 

the />K«/.. port does n'ot contain :;;?oTi£S;° t,' by^S^^o" ^ """^'^ '^"^'^ 



Table 99. Parameter bus definitions 



0x4 



CsO[7:0LTq0(1 :OLV0f2:01 



0x6 



0x9 



Cs1[7:0LTql(i:0LVir2:0J 
^H1(2:0I 



Cs2R:0LTq2C1.-0J_V^0] 



Cs0: identifier for the first scan component 

Tq0: quantization table identifier for the first scan component 

V0: vertical sampling factor for the first scan component, values = 1-4 

H0: horizontal sampling factor for the first scan component, values = 1-4 



Cs1, Tq1, V1 and H1 for the second scan component. 
V1, H1 undefined if NS<2 



0x8 CSHI1S.-0) 



000_HMAXI2:0J_VMAXP: 
OL MCUBU<I3K^NS{2:0] 



Cs3, Tq3, V3 and H3 for the fourth scan component. 
V3, H3 undefined if NS<4 



DRI: restart interval 



HMAX: maximal horizontal sampling factor in frame 
VMAX: maximal vertical sampling factor in frame 
MCUBLK: number of blocks per MCU of the current scan 

NS: number of scan components in current scan, 1-4 



22.5.6.2 JPEG decoder status register 

TTiestatiis register flags mdicate the cuirent state of the CS61 50 on«,H«„ wu • . 

ing the decoding process, the decompression procSs S JPEoT^ • 

sent to the CPU by assertinu cdu the JPEG decoder is suspended and an intcmipt is 

Wgh to indicate an em,r condi^ai SL'^Sle^^^^^^ 

is required from the user. If any of Zoth^e^ 1^ ^° °° intervention 

equation, the core wUl Sis^'^ift^St^'^tilX'J^tt^ .^^/"""T^^ ''"^ '''"^ 
more errors. (SOI) without triggering any 

^d^S;'*'^''*^"^'^'''""''"'^'^'^^^"^^^ 
Table 100. JPEG decoder status register definitions 



11-8 



TbfDefI3:0 ] 
OecHfEn^r 



Intficates the numt>ef of quantization tables defined. IbltAtaljIe. 



Set when an undefined Huffman table symbol is referenced during decoding 



HtError 



QtError 



OecEnrof 



IDctrnProg 



DecJnftnog 



JpglnProg 



Set when an invalid DHT segment is detected. 



Set when an invalid DQT segment is detected. 



^en anything other than a JPEG marker is Input 

Set when any of DecFlagsf6:4] are set 

Set when any data other than the SOI marker is detected at the start of a stream 
man or quanfization definftion detected. incomplete Huff- 



t^'^'^r.^':^'^'^''''- ^ ^ 'OCT has- 



wgnm nas oeen oulptii from the core and is de-asserted when tha i4<M-n.««» 
scan « complete. It Indicates that the com is i^eSr^JiS^ ^ 





22.5.7 Half-block buffer interface 



to stall the JPEG decoder c^TJS^^^T^i^r uri't ™' 
pixel). We provide a mecluuu^ S ^ L ^G^^ ^T^' P"*'' 
Jpgjo*^^ is 1. THe hal«,lockbS?„teSiS^?^^K?'! the dock to the core when 

half JPEG blocks to dewunte ippr- T . responsible for providing a set of double buflfered 

r>^i^i^ZS>l^)'t^^^^^li'T^ ''^^r''} ^ JPEG blocks to 

onlyasingte color plaT^^tiS^eoS^ 

'^^^^:^X^X^^t^,,r^ °' ' ^^'^ '-^^-"ock buffers and sotue sit^le 



[Block diagram: half-block buffer interface containing the half-block buffer select unit and the contone plane buffers, with signals including jpg_core_stall, pixel_data, rd_adv, rd_adv_half_block and half_block_ok_to_read]

Figure 102. Block diagram of half-block buffer interface 
22.5.7.1 Half-block buffer select unit 

this case, each bufferl a half SSbSi^32'^^^ 

single bit (wr_buff) for the cummt bX ^!f .*^'***^"**°~*="^ 

halfJblock^ok_to_read equals buff ^ "I"* '^'^ °"*P"» value 

Aj#_W/w._Z«^.When;>^co«3^^ yp^eo-^m// equals 

the production of pixels. The clock gating is ^on^in^t^^^J^'r"^ " ^ »° «°P 

outp« from the CDU. When /c/A^i'^,™^ 

</c/*_e««*fe is the inverse ^Tp^coi^^^!,!). yctt:_e««We is o. jdk is 0 

ptcei_caunt[4:0J is sf. Au^ ro^W ^S?^^^? ''^ ""^"^ When 

pir_o«r_v«/WANDed;ith&eiSofS^^ °"*P«* '^-^'^ equals 

ANDed with /rf arfv- of jpg.core_sta!L The output rd_en equals half_block_ok to \ad 



22.5.7.2 Contone plane buffer 

Each contone plane buffer consists of two half JPEG block buffers as shown in block diagram form in 
Figure 103. 

[Block diagram: contone plane buffer containing JPEG half-block buffer 0 and JPEG half-block buffer 1, with wr_en, rd_en and pixel_data signals]

Figure 103. Contone plane buffer interface 



lecttd«,toaa shift 4iSr1nSS^SiT^ f -""^ « "-Wt D«. 1. col- 
22.5.8 Write control unit 

• in each 256-bit DRAM word 



[Diagram: layout of decompressed contone data in DRAM. Lines 0 to 3 of JPEG blocks 0 to n occupy DRAM words p to p+4n+3 and lines 4 to 7 of JPEG blocks 0 to n occupy DRAM words q to q+4n+3. Each 256-bit DRAM word is divided into four 64-bit fields with upper bit positions 63, 127, 191 and 255. Legend: CX - Color X; LY - Line Y of 8 bytes of a line in a JPEG block.]
block 0, color 0, line 0 in word p bits 63-0, line 1 in word p+1 bits 63-0, 
                  line 2 in word p+2 bits 63-0, line 3 in word p+3 bits 63-0, 
block 0, color 0, line 4 in word q bits 63-0, line 5 in word q+1 bits 63-0, 
                  line 6 in word q+2 bits 63-0, line 7 in word q+3 bits 63-0, 
block 0, color 1, line 0 in word p bits 127-64, line 1 in word p+1 bits 127-64, 
                  line 2 in word p+2 bits 127-64, line 3 in word p+3 bits 127-64, 
block 0, color 1, line 4 in word q bits 127-64, line 5 in word q+1 bits 127-64, 
                  line 6 in word q+2 bits 127-64, line 7 in word q+3 bits 127-64, 
repeat for block 0 color 2, block 0 color 3 
block 1, color 0, line 0 in word p+4 bits 63-0, line 1 in word p+5 bits 63-0, 
etc. 
block N, color 3, line 4 in word q+4n bits 255-192, line 5 in word q+4n+1 bits 255-192, 
                  line 6 in word q+4n+2 bits 255-192, line 7 in word q+4n+3 bits 255-192 
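
The listing above follows a regular pattern that can be written as a small address function. The Python sketch below is an illustration derived from that pattern, not the CDU's address logic: the function name is hypothetical, p and q are the word addresses of the lines 0-3 and lines 4-7 regions, and the 64-bit field for each color is assumed to be color * 64 upwards, as the entries for colors 0, 1 and 3 suggest.

```python
def contone_location(block, color, line, p, q):
    """Map (JPEG block, color plane, line) to (DRAM word, high bit, low bit)."""
    if block < 0 or not 0 <= color <= 3 or not 0 <= line <= 7:
        raise ValueError("expected block >= 0, color 0..3, line 0..7")
    base = p if line < 4 else q          # lines 0-3 start at p, lines 4-7 at q
    word = base + 4 * block + (line % 4)  # four DRAM words per half block
    low_bit = 64 * color                  # one 64-bit field per color plane
    return word, low_bit + 63, low_bit


if __name__ == "__main__":
    # Arbitrary example base addresses.
    print(contone_location(block=0, color=1, line=2, p=100, q=200))
    print(contone_location(block=1, color=0, line=1, p=100, q=200))
```

Each 8-byte line of a block occupies one 64-bit field, so a single 256-bit DRAM word carries the same line of the same block for all four color planes.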
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only 64 bits out of th= asfrwi ^ to n»i T.""* """"P^ <»"toi» da. from S^S' 
-»™M^.CC«»,„4c.t«rSDSS"orJLr°'^ 

block to DRAM. Once the haJf-block biiffZ^IZ^ T1 " *° *^Pt »«> write a half JPEG 

requcstsawriteaccesstoDRAMby^S'cSr^v^"^ * "^^^^^ "^^'^^ ^ate^^S 
ing to the firet 64.bit value to be wriiwT^r^ providiag the wiite address coi«««nH 

access of 4x64 bits is^^ by ^Su °T^f55r-r^ ^"'^^ fim^S 

fourth 64^it values). The state LchSe^^afjyo 1^ ^ ^ ^5 

mg a read of 4 64-bit values fix.m the hal^WoTl^er^S^ acfaiowlcdge ftom the DIU before initiat- 

put cdu_fiiu_wvalid is asserted in the cycle aS S^^?^ ^ '^'^^ ^^'^ ^ rb^ out- 

thc c^_^,„_^ should Z^^Zt^^^i^^^r" '° l!.^'^ « on 

.s then sent to the half-block buffer inte^2t^^TZ^'^''^ f''^-'^-f-'(r~A'ockpul^ 

should now be avaflable to be written to ag^ S,^^ ""^^^^ has been read and 

TTe pseudocode below shows how the writ adZ *° "^"^ 

and flags should be ttej^^^l^^^"' ^ P"* cycle basis. Note counteis 

cleared and />.Jia(^^^r^:^SJ^- 2^^^^^ » f counters and flags sh^^^ 

l»f£Lftan_fub-t-maxJ,lodci^ I *«iL^tot.ad'r and uprJuU/block_adr gets loaded with 

// assign write address output to DRAM 
if (half == 1) then 
    cdu_diu_wadr = upr_halfblock_adr   // for lines 4-7 of JPEG block 
else 
    cdu_diu_wadr = lwr_halfblock_adr   // for lines 0-3 of JPEG block 

// update half, color, block and addresses after each DRAM write 
if (rd_adv_half_block == 1) then 
    if (half == 1) then 
        half = 0 
        if (color == max_plane) then 
            color = 0 
            if (block == max_block) then   // end of writing a line of JPEG blocks 
                pulse wradv8line 
                block = 0 
                // update half block address for start of next line of JPEG blocks taking 
                // account of address wrapping in circular buffer and 4 line offset 
                if (upr_halfblock_adr == buff_end_adr) then 
                    upr_halfblock_adr = buff_start_adr + max_block + 1 
                elsif (upr_halfblock_adr + max_block + 1 == buff_end_adr) then 
                    upr_halfblock_adr = buff_start_adr 
                else 
                    upr_halfblock_adr = upr_halfblock_adr + max_block + 2 
            else 
                block ++ 
                upr_halfblock_adr ++   // move to address for lines 4-7 for next block 
        else 
            color ++ 
    else 
        half = 1 
        if (color == max_plane) then 
            if (block == max_block) then   // end of writing a line of JPEG blocks 
                // update half block address for start of next line of JPEG blocks taking 
                // account of address wrapping in circular buffer and 4 line offset 
                if (lwr_halfblock_adr == buff_end_adr) then 
                    lwr_halfblock_adr = buff_start_adr + max_block + 1 
                elsif (lwr_halfblock_adr + max_block + 1 == buff_end_adr) then 
                    lwr_halfblock_adr = buff_start_adr 
                else 
                    lwr_halfblock_adr = lwr_halfblock_adr + max_block + 2 
            else 
                lwr_halfblock_adr ++   // move to address for lines 0-3 for next block 
[State diagram: states idle, req, ack, read, write1, write2, write3 and write4, with outputs cdu_diu_wreq, cdu_diu_wvalid, rd_adv and rd_adv_half_block]

Figure 105. State machine to write decompressed contone data 
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S5 



The contone line store ZVZ pro^s ZZ^il^^?'?''' "^ *t "T" line-at-a-dme. 
DRAM. ^dproyi6^siea^^^Z^S^en^^^'°l^^ "^^^^^ ""nber of lines ston^i„ 
^ 6 cannot be read Ironiuntd the complete «ne has be« Wit. 

write to. n>us the size of the line storein dSS^.?^^ ^"""^ C^U to 

line sto« interfece is 8 lines. ^ o5^g a^^bS^^ sSemT"^ ' V- "^f »i^e of the 

scheme while 16 lines provid*^ a doubLbS scSe ^ «zes -« 12 lines for a 1.5 buffer 

?o^t D^"r"a^^4s: to t't^te^^r^^f^^^r'-^j^ 

set to the value of ««, buff T?.e cS?^,i « Y^! ^ transitions from 0 to 1. numjines^avail is 
available for 8 lines. in!^£T^hJ^,^L'^°^^^ T**" ^ " « ^^ce 

^vriti„g 8 lines, the v^ite conZl^ ^^'^1^^^;^''-^ ^'l " ^^'^ 
CPU. and ««m_//>,«^rirdS^«,^"t r^'^l^r v"^^^^^ 

*«e^/iw_o*_,o ivrf/eto beset aeain^^mic^ J^r "^^^ for 
Priatdy. and sini its o r^SlE^ SSSSu?"'^^ for responding to n^rf^/Me pulses appro- 
it finishes teading them. n^^^^-r^'L^^ 
23 Contone FIFO Unit (CFU) 



23.1 OVERVIEW 



color inversion in up to 4 color planes and7hZ^ fL!i !! ^ foUowed by optional 

23.2 Bandwidth requirements 

Scaling in the Y direction is performed by the CFU reading the same line of contone data from DRAM a number of times, according to the Y-scale factor. The HCU generates 1 dot per system clock cycle to achieve a print speed of 1 side per 2 seconds for full bleed A4/Letter printing. The output buffer needs to be supplied with a 4 color contone pixel (32 bits) every 6 cycles, so the CFU must read data from DRAM at 5.33 bits/cycle (note 1).

1. 32 bits / 6 cycles = 5.33 bits/cycle

23.3 Color space conversion

The CFU handles contone data in up to 4 color planes. These are typically C, M, Y and K, directly represented by the 4 color planes, but the planes may carry other colors, for example for multi-SoPEC printing with extra colors.

JPEG produces better compression ratios for a given visible quality when luminance and chrominance channels are separated. With CMYK, K can be considered to be luminance, but C, M and Y each contain luminance information. The CFU therefore provides the means by which CMY can be passed to SoPEC as YCrCb.

At decompression, the YCrCb data is obtained, then color converted to RGB, and finally back to CMY. Y, Cr and Cb are normalized to occupy all 256 levels of an 8-bit binary encoding.

The output of the color space converter is one of:

• 1 color plane, no color space conversion

• 2 color planes, no color space conversion

• 3 color planes, no color space conversion

• 3 color planes YCrCb, conversion to RGB

• 4 color planes, no color space conversion

• 4 color planes YCrCbX, conversion of YCrCb to RGB, no color conversion of X

23.4 Color space inversion 

Color inversion may be applied so that conversion to CMY may be finalised, or it may be used to provide planar correlation. The RGB to CMY conversion is given by the relationship:

• C = 255 - R

• M = 255 - G

• Y = 255 - B

These relationships require the page RIP to calculate the RGB from CMY as follows:

• R = 255 - C

• G = 255 - M

• B = 255 - Y
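As a minimal sketch of the inversion relationships, the following Python helper inverts selected 8-bit planes of one pixel. The function name and the use of a bit mask (bit i selects plane i) are illustrative assumptions, loosely modelled on the per-plane invert bits in the register table.

```python
def invert_planes(pixel, invert_mask):
    """Return the pixel with each selected 8-bit plane replaced by 255 - value.

    pixel is a tuple of up to 4 plane values (0..255); bit i of invert_mask
    selects plane i. Both conventions are assumptions for illustration.
    """
    return tuple(
        255 - value if (invert_mask >> index) & 1 else value
        for index, value in enumerate(pixel)
    )


# Invert the first three planes (e.g. RGB to CMY) and leave the fourth alone.
cmyx = invert_planes((0, 128, 255, 7), 0b0111)
```

Applying the same mask twice returns the original values, which is why the page RIP can use the identical relationship to go from CMY back to RGB.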

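23.5 Scaling

The CFU supports non-integer scaling, with the scale factor represented by a numerator and a denominator. Only scaling up of the pixel data is allowed, i.e. the numerator should be greater than or equal to the denominator. For example, to scale up by a factor of two and a half, the numerator is programmed as 5 and the denominator as 2.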



if (count + denominator - numerator >= 0)
    count = count + denominator - numerator
    advance = 1
else
    count = count + denominator
    advance = 0
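The scaling rule above can be exercised with a short Python sketch that replicates input pixels. The function name, the list-based interface and the default starting count of 0 are assumptions made for illustration; the hardware applies the same test once per output dot.

```python
def scale_line(pixels, numerator, denominator, start_count=0):
    """Replicate pixels by numerator/denominator using the count/advance rule."""
    if denominator <= 0 or numerator < denominator:
        raise ValueError("only scaling up is supported (numerator >= denominator > 0)")
    out = []
    count = start_count
    for pixel in pixels:
        while True:
            out.append(pixel)  # emit one output dot for the current input pixel
            if count + denominator - numerator >= 0:
                count += denominator - numerator
                break  # advance = 1: move on to the next input pixel
            count += denominator  # advance = 0: repeat the same input pixel


    return out
```

With a numerator of 5 and a denominator of 2, input pixels are emitted alternately 3 and 2 times, giving the average factor of 2.5. A non-zero start_count changes how many times the first pixel is repeated.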
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23.6 LEAD-IM AND LEAD-OUT CLIPPING 

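In a multi-SoPEC system, 2 SoPECs may be responsible for printing the same side of a page, and the division of the contone layer between them may not fall on a JPEG block boundary. The JPEG block on the boundary of the 2 SoPECs (JPEG block n below) will be the last JPEG block in the line printed by SoPEC #1 and the first JPEG block in the line printed by SoPEC #2.

Pixels in this JPEG block not destined for SoPEC #1 are ignored by appropriately setting the LeadOutClipNum register. Pixels in this JPEG block not destined for SoPEC #2 must be ignored at the beginning of each line. The number of pixels to be ignored at the start of each line is specified by the LeadInClipNum register.

It may also be the case that the CDU writes out more JPEG blocks than are required to be read by the CFU, as shown for SoPEC #2 below. In this case the value of the MaxBlock register in the CDU is set to correspond to JPEG block m, but the value of the MaxBlock register in the CFU is set to correspond to JPEG block n. JPEG blocks beyond block n are thus ignored by the CFU.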



Figure 106. Lead-in and lead-out clipping of contone data in multi-SoPEC environment. SoPEC #1 prints the left side of the page and SoPEC #2 prints the right side of the page; each has its own lead-in area and lead-out area.

Additional clipping is required when contone pixels are scaled up to the printer's resolution. The scaling of the first valid pixel in a line is controlled by the XstartCount register. The HcuLineLength register defines the size of the target page for the contone layer at the printer's resolution, and controls the scaling of the last valid pixel in a line sent to the HCU.

23.7 Implementation

Figure 107 shows a block diagram of the CFU.



Figure 107. Block diagram of CFU. The CFU connects to the DRAM Interface Unit, the Contone Decoder Unit, the PEP Controller Unit and the Halftone/Compositor Unit. It contains the Y-scaling control unit, the decompressed contone buffer, the color space converter (YCrCb2RGB and invert_color_plane stages on planes cp0 to cp3), the output double-buffer, the X-scaling control unit, the contone line store interface and the configuration registers.

23.7.1 Definitions of I/O

CFU port list and description

Clocks and reset
• pclk (In): System clock.
• prst_n (In): System reset, synchronous active low.

PCU interface
• pcu_cfu_sel (In): Block select from the PCU. When pcu_cfu_sel is high both pcu_adr and pcu_dataout are valid.
• pcu_rwn (In): Common read/not-write signal from the PCU.
• pcu_adr[6:2] (In): PCU address bus. Only 5 bits are required to decode the address space for this block.
• pcu_dataout[31:0] (32 bits, In): Shared write data bus from the PCU.
• cfu_pcu_rdy (Out): Ready signal to the PCU. When cfu_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block, and for a read cycle this means the data on cfu_pcu_data is valid.
• cfu_pcu_data[31:0] (32 bits, Out): Read data bus to the PCU.

DIU interface
• cfu_diu_rreq (Out): Read request to the DIU. A read request must be accompanied by a valid read address.
• diu_cfu_rack (In): Acknowledge from DIU, active high. Indicates that a read request has been accepted and the new read address can be placed on the address bus, cfu_diu_radr.
• cfu_diu_radr[21:5] (17 bits, Out): CFU read address. 17 bits wide (256-bit aligned word).
• diu_cfu_rvalid (In): Read data valid, active high. Indicates that valid read data is now on the read data bus.

CDU interface
• cdu_cfu_wradv8line (In): Indicates that the CDU has written 8 lines of data to the circular buffer in DRAM and the data is available to be read by the CFU.
• cfu_cdu_rdadvline (Out): Indicates that the CFU has read a line of decompressed contone data from the circular buffer in DRAM and that line of the buffer is now free.

HCU interface
• hcu_cfu_advdot (In): Informs the CFU that the HCU has captured the pixel data on the cfu_hcu_c[0-3]data lines and the CFU can now place the next pixel on the data lines.
• cfu_hcu_avail (Out): Indicates valid data present on the cfu_hcu_c[0-3]data lines.
• cfu_hcu_c0data[7:0] (Out): Pixel of data in contone plane 0.
• cfu_hcu_c1data[7:0] (Out): Pixel of data in contone plane 1.
• cfu_hcu_c2data[7:0] (Out): Pixel of data in contone plane 2.
• cfu_hcu_c3data[7:0] (Out): Pixel of data in contone plane 3.
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23.7.2 Configuration registers 

The configuration registers in the CFU are programmed via the PCU interface.



Table 102. CFU registers

• Go: Writing 1 to this register starts the CFU. Writing 0 to this register halts the CFU. When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values. When Go is asserted all counters are reset, but configuration registers keep their values (i.e. they don't get reset). The CFU must be started before the CDU is started. This register can be read to determine if the CFU is running (1 - running, 0 - stopped).

• MaxBlock: Number of JPEG MCUs (or JPEG block equivalents, i.e. 8x8 bytes) in a line - 1.

• BuffStartAdr: Points to the start of the decompressed contone circular buffer in DRAM, aligned to a half JPEG block boundary. A half JPEG block consists of 4 words of 256 bits, enough to hold 32 contone pixels in 4 colors, i.e. half a JPEG block.

• BuffEndAdr: Points to the end of the decompressed contone circular buffer in DRAM, aligned to a half JPEG block boundary (address is inclusive). A half JPEG block consists of 4 words of 256 bits, enough to hold 32 contone pixels in 4 colors, i.e. half a JPEG block.

• 4LineOffset: Defines the offset between the start of one 4 line store to the start of the next 4 line store. In Figure 108, if BuffStartAdr corresponds to line 0 block 0 then BuffStartAdr + 4LineOffset corresponds to line 4 block 0. This register is required in addition to MaxBlock as the number of JPEG blocks in a line required by the CFU may be different from the number of JPEG blocks in a line written by the CDU.

• YCrCb2RGB: Set this bit to enable conversion from YCrCb to RGB. Should not be changed between bands.

• InvertColorPlane: Set these bits to perform color inversion on the corresponding color plane. Should not be changed between bands.
    bit0: 1 - invert color plane 0, 0 - do not convert
    bit1: 1 - invert color plane 1, 0 - do not convert
    bit2: 1 - invert color plane 2, 0 - do not convert
    bit3: 1 - invert color plane 3, 0 - do not convert

• HcuLineLength: Number of contone pixels - 1 in a line (after scaling). Equals the number of hcu_cfu_advdot pulses - 1 received from the HCU for each line of contone data.

• LeadInClipNum: Number of contone pixels to be ignored at the start of a line (from JPEG block 0 in a line). They are not passed to the output buffer to be scaled in the X direction.

• LeadOutClipNum: Number of contone pixels to be ignored at the end of a line (from JPEG block MaxBlock in a line). They are not passed to the output buffer to be scaled in the X direction.

• XstartCount (reset 0x00): Value to be loaded at the start of every line into the counter used for scaling in the X direction. Used to control the scaling of the first pixel in a line to be sent to the HCU. This value will typically be zero, except in the case where a number of dots are clipped on the lead in to a line.

• XscaleNum (reset 0x01): Numerator of contone scale factor in X direction.

• XscaleDenom (reset 0x01): Denominator of contone scale factor in X direction.

• YscaleNum (reset 0x01): Numerator of contone scale factor in Y direction.

• YscaleDenom (reset 0x01): Denominator of contone scale factor in Y direction.

23.7.3 Storage of decompressed contone data in DRAM

The CFU reads decompressed contone data from DRAM in single 256-bit accesses. JPEG blocks of decompressed contone data are stored in DRAM with the arrangement shown in Figure 108. This arrangement is in order to optimize access for reads, by storing the 4 color components of a line together in each 256-bit DRAM word. The CFU thus reads 64 bits in 4 colors from a single line in each 256-bit DRAM access.

In Figure 108, each JPEG block occupies four consecutive 256-bit DRAM words in each of two 4 line stores. For JPEG block 0, words p to p+3 hold lines 0 to 3 and words q to q+3 hold lines 4 to 7; for JPEG block n the corresponding words start at p+4n and q+4n. Within each word, bits 63:0 hold color 0, bits 127:64 color 1, bits 191:128 color 2 and bits 255:192 color 3 of one line (for example C3L0, C2L0, C1L0 and C0L0 in word p). CX denotes color X, and LY denotes line Y, i.e. 8 bytes of a line in a JPEG block. Each word implies one 256-bit read of DRAM.

Figure 108. DRAM storage arrangement for a single line of JPEG blocks in 4 colors

The CFU reads decompressed contone data one line at a time in 4 colors from DRAM. The read sequence, as shown in Figure 108, is:

    line 0, block 0 in word p of DRAM
    line 0, block 1 in word p+4 of DRAM
    ...
    line 0, block n in word p+4n of DRAM
    (repeat to read line a number of times according to scale factor)
    line 1, block 0 in word p+1 of DRAM
    line 1, block 1 in word p+5 of DRAM
    etc.

The CFU reads a complete line in up to 4 colors a Y-scale factor number of times from DRAM before it moves on to read the next line. When the CFU has finished reading 4 lines of contone data, that 4 line store becomes available for the CDU to write to.
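The word ordering in the read sequence can be illustrated with a small Python address calculation. This is a sketch under stated assumptions: the function name is hypothetical, buff_start_adr and four_line_offset are taken to be in units of half JPEG blocks (4 DRAM words), and circular-buffer wrapping is deliberately ignored.

```python
def dram_word_address(buff_start_adr, four_line_offset, line, block):
    """Return the 256-bit word index holding `line` of JPEG `block`.

    Assumptions: addresses are counted in half JPEG blocks (4 words each),
    `line` counts from 0 at the start of the buffer, and no wrap occurs.
    """
    line_store = line // 4                      # which 4 line store the line is in
    halfblock = buff_start_adr + line_store * four_line_offset + block
    return (halfblock << 2) | (line & 0b11)     # word within the half block


# With p = 0: line 0 block 1 is word p+4, line 1 block 0 is word p+1,
# and line 1 block 1 is word p+5, matching the sequence above.
words = [dram_word_address(0, 10, line, block) for line in (0, 1) for block in (0, 1)]
```

The low two bits of the word index select the line within a 4 line store, which is why consecutive blocks of the same line are 4 words apart.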

23.7.4 Decompressed contone buffer

23.7.5 Y-scaling control unit

The Y-scaling control unit reads the decompressed contone data from DRAM in single 256-bit accesses, receiving the data from the DIU over 4 clock cycles (64 bits per cycle). Read accesses to DRAM are implemented by means of the state machine described in Figure 109.

The state machine relies on the lines_ok_to_read and buff_ok_to_write flags to tell it whether to proceed with reading from DRAM. When there is no line available to read, the state machine does not issue a request. Reading continues once a line is available and there is space available in the buffer.

During each access, wr_sel selects the 64-bit section of the buffer being written, and wr_buff identifies the current buffer that writes are to occur to. When a complete 256-bit word of data has been transferred from DRAM to the buffer, wr_adv_buff is pulsed. The color space converter then writes the data to the output double-buffer of the CFU.

The read address is generated as shown in the pseudocode below, re-reading each line of decompressed contone data according to the Y scale factor. Scaling to the printhead resolution in the Y direction is thus performed.

    // assign read address output to DRAM
    cfu_diu_radr[21:7] = curr_halfblock
    cfu_diu_radr[6:5] = line[1:0]

    // update block, line, y_scale_count and addresses after each DRAM read access
    if (wr_adv_buff == 1) then
        if (block == max_block) then   // end of reading a line of contone in up to 4 colors
            block = 0
            // check whether to advance to next line of contone data in DRAM
            if (y_scale_count + y_scale_denom - y_scale_num >= 0) then
                y_scale_count = y_scale_count + y_scale_denom - y_scale_num
                pulse RdAdvline
                if (line == 3) then   // end of reading 4 line store of contone data
                    line = 0
                    // move to the next 4 line store, wrapping at the end of the circular buffer
                    if (end of circular buffer reached) then
                        curr_halfblock = buff_start_adr
                        line_start_adr = buff_start_adr
                    else
                        curr_halfblock = line_start_adr + 4line_offset
                        line_start_adr = line_start_adr + 4line_offset
                else
                    line ++
                    curr_halfblock = line_start_adr
            else
                // re-read current line from DRAM
                y_scale_count = y_scale_count + y_scale_denom
                curr_halfblock = line_start_adr
        else
            block ++
            curr_halfblock ++

Figure 109. State machine to read decompressed contone data from DRAM. The states are idle, req, ack, read1, read2, read3 and read4; the outputs shown are cfu_diu_rreq, wr_sel and wr_adv_buff.

23.7.6 Contone line store interface

The CFU may only read from DRAM when the CDU has written data. When the CDU has finished writing 8 lines it sends a cdu_cfu_wradv8line pulse to the CFU, and buff_lines_avail is incremented by 8. The CFU may continue reading from DRAM as long as buff_lines_avail is greater than 0; lines_ok_to_read is set while buff_lines_avail is greater than 0.

When it has completely finished reading a line of contone data from DRAM, the Y-scaling control unit sends a read-advance-line pulse to the contone line store interface and to the CDU, to free up the line in the buffer in DRAM. buff_lines_avail is decremented by 1 on receiving this pulse.

23.7.7 Color Space Converter (CSC)
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SI 



Figure 110 shows a block diagram of the color space converter.

Figure 110. Block diagram of color space converter. Color planes cp0 to cp3 pass through the YCrCb2RGB stage and then the invert_color_plane stage.

Accuracy is maintained with 18 bits. The conversion is implemented as follows:

• R* = Y + (359/256)(Cr-128)

• G* = Y - (183/256)(Cr-128) - (88/256)(Cb-128)

• B* = Y + (454/256)(Cb-128)
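The conversion equations can be checked with a short integer-arithmetic sketch in Python. The function name is hypothetical, and the rounding rule (round half up, then saturate to 0..255) is an assumption inferred from the document's notes on rounding and saturation; the internal datapath of the hardware is not modelled.

```python
def ycrcb_to_rgb(y, cr, cb):
    """Convert one 8-bit YCrCb pixel to RGB using the /256 coefficients."""

    def round_and_saturate(numerator_256):
        # numerator_256 is the result scaled by 256; add half and floor-divide
        # to round half up, then clamp to the 8-bit range.
        value = (numerator_256 + 128) // 256
        return max(0, min(255, value))

    r = round_and_saturate(256 * y + 359 * (cr - 128))
    g = round_and_saturate(256 * y - 183 * (cr - 128) - 88 * (cb - 128))
    b = round_and_saturate(256 * y + 454 * (cb - 128))
    return r, g, b


# Y = 0, Cr = 0, Cb = 0 gives unrounded values of about -179, 135.5 and -227.
corner = ycrcb_to_rgb(0, 0, 0)
```

Working with numerators scaled by 256 keeps the arithmetic exact, so the half-way case (135.5) rounds deterministically.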

23.7.8 X-scaling control unit

The X-scaling control unit scales the contone data in the X direction up to the printhead resolution. The output double-buffer interface provides the mechanism for keeping track of the data in the buffer, so that a buffer cannot be read from until it has been written to.

1. -179 is saturated to 0

2. 135.5, with rounding becomes 136.

3. -227 is saturated to 0
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< « / , 



    if (wradv == 1) then
        if (pixel_count == {max_block, b111}) then
            pixel_count = 0
        else
            pixel_count ++

    if ((pixel_count < leadin_clip_num)
        OR (pixel_count > ({max_block, b111} - leadout_clip_num))) then
        wr_en = 0
    else
        wr_en = 1
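The clipping test in the pseudocode can be tabulated for a whole line with a brief Python sketch. The function name is hypothetical, and the sketch only evaluates the comparison for each pixel position; it does not model the timing of wradv.

```python
def write_enables(max_block, leadin_clip_num, leadout_clip_num):
    """Return wr_en for every pixel position in a line of max_block + 1 JPEG blocks."""
    last_pixel = (max_block << 3) | 0b111   # {max_block, b111}: 8 pixels per block
    return [
        not (pixel < leadin_clip_num or pixel > last_pixel - leadout_clip_num)
        for pixel in range(last_pixel + 1)
    ]


# Two JPEG blocks (16 pixels), clipping 2 pixels on lead-in and 3 on lead-out.
flags = write_enables(max_block=1, leadin_clip_num=2, leadout_clip_num=3)
```

In this example pixels 0 to 1 and 13 to 15 are suppressed, so 11 pixels reach the output buffer.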

When cfu_hcu_avail is 1, this indicates to the HCU that data is available to be read from the CFU. The HCU responds by asserting hcu_cfu_advdot to indicate that the HCU has captured the pixel data on the cfu_hcu_c[0-3]data lines and the CFU can now place the next pixel on the data lines.

The input pixels may be scaled a non-integer number of times in the X direction to produce the output pixels for the HCU at the printhead resolution. Scaling is implemented by pixel replication. The algorithm for non-integer scaling is described in the pseudocode below. Note that x_scale_count should be loaded with x_start_count after reset and at the end of each line. This controls the amount by which the first pixel is scaled. hcu_line_length and hcu_cfu_advdot control the amount by which the last pixel in a line that is sent to the HCU is scaled.

    if (hcu_cfu_advdot == 1) then
        if (x_scale_count + x_scale_denom - x_scale_num >= 0) then
            x_scale_count = x_scale_count + x_scale_denom - x_scale_num
            rd_en = 1
        else
            x_scale_count = x_scale_count + x_scale_denom
            rd_en = 0
    else
        x_scale_count = x_scale_count
        rd_en = 0

A count is kept of the number of dots sent to the HCU in each line. When this count equals hcu_line_length and an hcu_cfu_advdot pulse is received, a pulse is generated to advance to the next pixel, the count is reset to 0 and x_scale_count is loaded with x_start_count.
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24 Lossless Bi-level Decoder (LBD) 



24.1 Overview 



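The LBD can cope with any compression ratio if the requested DRAM access is available. A pass-through mode is provided for 1:1 compression. Plain text compresses with a ratio of about 50:1. Lossless bi-level compression across an average page is about 20:1, with 10:1 possible for pages which compress poorly.

The decompressed bi-level data is output to the SFU (Spot FIFO Unit), and in turn becomes an input to the HCU (Halftoner/Compositor unit) for the next stage in the printing pipeline. The LBD also outputs a lbd_finishedband control flag that is used by the PCU and is available as an interrupt to the CPU.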



Figure 111. High level block diagram of LBD in context. The LBD connects to the DRAM Interface Unit, the PCU (via lbd_finishedband) and the Spot FIFO Unit, which in turn feeds the HCU.



24.2 Main features of LBD 



Doc: SoPEC_hardware_design
Version: 2.3



Figure 112 shows a schematic outline of the LBD and SFU.

The LBD in SoPEC can run much faster than is required. This is useful for allowing stalls, e.g. due to band processing latency, to be absorbed.

The LBD has a pass through mode. Pass through mode continues either to the end of the line or for a pre-programmed number of bits, whichever is shorter. The special run-length code that activates pass through is always executed as a run-length code, followed by pass through.

The LBD outputs decompressed bi-level data to the NextLineFIFO in the Spot FIFO Unit (SFU). This stores the decompressed lines in DRAM, nominally 3 lines, up to a programmable number of lines.

A signal sfu_lbd_rdy indicates that both the SFU's NextLineFIFO and PrevLineFIFO are available for writing and reading, respectively.

An A3 line of 19488 dots requires 2.4 Kbytes of storage.




Figure 112. Schematic outline of the LBD and the SFU. The LBD reads compressed data from DRAM and exchanges the next_line, prev_line and curr_line FIFOs with the SFU, which writes to and reads from DRAM and feeds the HCU. All FIFOs are 64 bytes (twice the DRAM data word width).

24.2.1 Bi-level Decoding in the LBD

Run length encodings

• RRRRR1: Short Black Runlength (5 bits)

• RRRRR1: Short White Runlength (5 bits)

• RRRRRRRRRR10: Medium Black Runlength (10 bits)

• RRRRRRRR10: Medium White Runlength (8 bits)

• RRRRRRRR10: Medium White Runlength with RRRRRRRR <= 31, Enter pass through

• RRRRRRRRRRRRRRR00: Long Black Runlength (15 bits)

• RRRRRRRRRRRRRRR00: Long White Runlength (15 bits)

A medium length run-length with a run of less than or equal to 31 is used to enter pass through mode. The run-length bits are read in the same way (least significant bit at the right to most significant bit at the left).

Pass through mode is used to pass the data to the LBD as un-compressed data. Pass through mode is a new feature that was not implemented in the PEC1 version of the LBD. While in pass through mode the data stream is an uncompressed bit stream.



24.2.2 DRAM Access Requirements

Table 105. DRAM bandwidth requirements

• Direction: Read
• Maximum number of cycles between each 256-bit DRAM access: 256 (note 1) (1:1 compression)
• Peak Bandwidth (bits/cycle): 1 (1:1 compression)
• Average Bandwidth (bits/cycle): 0.1 (10:1 compression)

1. At 1:1 compression the LBD requires 1 bit/cycle or 256 bits every 256 cycles.
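The figures in Table 105 follow from simple arithmetic, sketched below in Python. The function names are illustrative, and the calculation assumes the LBD consumes 1 bit of uncompressed output per cycle, as the footnote implies.

```python
ACCESS_BITS = 256  # size of one DRAM access in bits


def input_bits_per_cycle(compression_ratio, output_bits_per_cycle=1.0):
    """Compressed input bandwidth needed to sustain the given output rate."""
    return output_bits_per_cycle / compression_ratio


def cycles_between_accesses(compression_ratio):
    """Cycles that one 256-bit DRAM access lasts at the given compression ratio."""
    return ACCESS_BITS / input_bits_per_cycle(compression_ratio)


peak = input_bits_per_cycle(1)       # 1:1 compression (pass-through)
average = input_bits_per_cycle(10)   # 10:1 compression
```

At 1:1 compression one access is consumed every 256 cycles, which is the worst case the DRAM arbitration has to allow for; at 10:1 the same access lasts ten times as long.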
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i3 



24.3.1 Definitions of IO

Table 106. LBD Port List

Clocks and Resets
• prst_n (In): Global reset signal.

Bandstore signals
• cdu_endofbandstore[21:5] (17 bits, In): Address of the end of the current band of data. 256-bit word aligned DRAM address.
• cdu_startofbandstore[21:5] (17 bits, In): Address of the start of the current band of data. 256-bit word aligned DRAM address.
• lbd_finishedband (Out): LBD finished band signal to PCU and interrupt controller.

DIU interface
• Read request (Out): A read request must be accompanied by a valid read address.
• Read address (17 bits, Out): Read address to DIU. 17 bits wide (256-bit aligned word).
• Read acknowledge (In): Acknowledge from DIU that read request has been accepted.
• Read data (64 bits, In): Data from DIU to SoPEC Units. First 64-bits is bits 63:0 of 256 bit word. Second 64-bits is bits 127:64 of 256 bit word. Third 64-bits is bits 191:128 of 256 bit word. Fourth 64-bits is bits 255:192 of 256 bit word.

PCU interface
• lbd_pcu_datain (Out): Read data bus from the LBD to the PCU.
• pcu_rwn (In): Common read/not-write signal from the PCU.
• pcu_lbd_sel (In): Block select from the PCU. When pcu_lbd_sel is high both pcu_adr and pcu_dataout are valid.
• lbd_pcu_rdy (Out): Ready signal to the PCU. When lbd_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block, and for a read cycle this means the data on lbd_pcu_datain is valid.

SFU interface
• sfu_lbd_rdy (In): Indicates that the SFU is available for reading and is also ready to be written.
• Next line write data (Out): Write data for next line buffer.
• Next line write data valid (Out): Write data valid signal for next line buffer data.
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24.3.2 Configuration Registers 

Table 107. LBD Configuration Registers

Control registers

• Reset: A write to this register causes a reset of the LBD. This register can be read to indicate the reset state: 0 - reset in progress, 1 - reset not in progress.

• Go: Writing 1 to this register starts the LBD. Writing 0 to this register halts the LBD. The Go register is reset to 0 by the LBD when it finishes processing a band. When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values. When Go is asserted all counters are reset, but configuration registers keep their values (i.e. they don't get reset). The LBD should only be started after the SFU is started. This register can be read to determine if the LBD is running (1 - running, 0 - stopped).

Setup registers (constant during processing of the page)

• LineLength (address 0x08, 16 bits, reset 0x0000): Width of expanded bi-level line (in dots) (must be a multiple of 16 bits).

• PassThroughEnable (address 0x0C, 1 bit, reset 0x1): Writing 1 to this register enables pass-through mode. Writing 0 to this register disables pass-through mode, thereby making the LBD compatible with PEC1.

• PassThroughDotLength (address 0x10, 16 bits, reset 0x0000): Number of dots for which pass-through mode will last. If the end of the line is reached first then pass-through will be disabled.

Work registers (need to be set up before processing a band)

• NextBandCurrReadAdr[21:5] (256-bit aligned DRAM address, 17 bits, reset 0x00000): Shadow register which is copied to CurrReadAdr when (NextBandEnable == 1 & Go == 0). NextBandCurrReadAdr is the address of the start of the next band of compressed bi-level data in DRAM.

• NextBandLinesRemaining (address 0x18, 15 bits, reset 0x0000): Shadow register which is copied to LinesRemaining when (NextBandEnable == 1 & Go == 0). NextBandLinesRemaining is the number of lines to be decoded in the next band of compressed bi-level data.

• NextBandPrevLineSource (reset 0x0): Shadow register which is copied to PrevLineSource when (NextBandEnable == 1 & Go == 0).
    1 - use the previous line read from the SFU for decoding the first line at the start of the next band.
    0 - ignore the previous line read from the SFU for decoding the first line at the start of the next band (an all 0's line is used instead).

• NextBandEnable (reset 0x0): If (NextBandEnable == 1 & Go == 0) then
    - NextBandCurrReadAdr is copied to CurrReadAdr,
    - NextBandLinesRemaining is copied to LinesRemaining,
    - NextBandPrevLineSource is copied to PrevLineSource,
    - Go is set,
    - NextBandEnable is cleared.

Work registers (read only for external access)

• CurrReadAdr[21:5] (address 0x24, 256-bit aligned DRAM address, 17 bits): The current 256-bit aligned read address within the compressed bi-level image (DRAM address). Read only register.

• LinesRemaining (address 0x28): Count of number of lines remaining to be decoded. The band has finished when this number reaches 0. Read only register.

• PrevLineSource (address 0x2C, 1 bit):
    1 - uses the previous line read from the SFU for decoding the first line at the start of the next band.
    0 - ignores the previous line read from the SFU for decoding the first line at the start of the next band (an all 0's line is used instead).
  Read only register.

• CurrWriteAdr (address 0x30, 15 bits): The current dot position for writing to the SFU. Read only register.

• FirstLineOfBand (address 0x34, 1 bit): Indicates whether the current line is considered to be the first line of the band. Read only register.


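24.3.3 Starting the LBD between bands

The LBD is programmed with a start address for the compressed bi-level data, together with the other parameters for the band. The LBD decodes a single band and then stops, clearing its Go bit and issuing a pulse on lbd_finishedband.

The mechanisms for restarting the LBD between bands include the following:

• lbd_finishedband causes an interrupt to the CPU. The LBD will have stopped and cleared its Go bit. The CPU reprograms the LBD, typically the NextBandCurrReadAdr and the other shadow registers, and sets NextBandEnable to restart the LBD.

• The shadow registers and NextBandEnable are programmed before the end of the current band, so that the LBD restarts immediately at the end of the band. Simultaneously, lbd_finishedband triggers the PCU to fetch commands from DRAM. The LBD will have restarted by the time the PCU has fetched commands from DRAM. The PCU commands program the LBD's shadow registers and set NextBandEnable for the next band.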
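The shadow-register hand-over used by these restart mechanisms can be sketched in Python. This is a behavioural illustration only: the dataclass, its snake_case field names and the function name are hypothetical, and the copy condition is the one given in the register table (NextBandEnable == 1 and Go == 0).

```python
from dataclasses import dataclass


@dataclass
class LbdRegisters:
    go: int = 0
    next_band_enable: int = 0
    next_band_curr_read_adr: int = 0
    next_band_lines_remaining: int = 0
    next_band_prev_line_source: int = 0
    curr_read_adr: int = 0
    lines_remaining: int = 0
    prev_line_source: int = 0


def service_next_band(regs: LbdRegisters) -> bool:
    """Copy the shadow registers and restart if NextBandEnable == 1 and Go == 0."""
    if regs.next_band_enable == 1 and regs.go == 0:
        regs.curr_read_adr = regs.next_band_curr_read_adr
        regs.lines_remaining = regs.next_band_lines_remaining
        regs.prev_line_source = regs.next_band_prev_line_source
        regs.go = 1                 # Go is set
        regs.next_band_enable = 0   # NextBandEnable is cleared
        return True
    return False
```

If the shadow registers and NextBandEnable are written while Go is still 1, nothing happens until the band finishes and Go is cleared, which is what allows a restart without CPU intervention at the band boundary.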
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24.3.4 Top-level Description 

A block diagram of the LBD is shown in Figure 113.



Figure 113. Block diagram of lossless bi-level decoder
The LBD contains the following sub-blocks:

Table 108. Functional sub-blocks in the LBD

• Registers and Resets

• Stream Decoder: Accesses the bi-level description from the DRAM through the DIU interface. It decodes the bit stream into a command with arguments, which it then passes to the command controller.

• Command Controller: Interprets the command from the stream decoder and provides the line fill unit with a limit address and color to fill the SFU Next Line Buffer. It also provides the next edge unit with a starting address to look for the next edge.

• Next Edge Unit: Scans through the Previous Line Buffer using its current address to find the next edge of a color provided by the command controller. The next edge unit outputs this as the next current address back to the command controller and sets a valid bit when this address is at the next edge.

• Line Fill Unit: Fills the SFU Next Line Buffer with a color from its current address up to a limit address. The color and limit are provided by the command controller.





S3 Propiletary Doixrment 



29^v 2002 



Naming of signals and logical blocks are taken from [18].

The LBD is able to stall mid-line should the SFU be unable to supply a previous line or receive a current line frame due to band processing latency.

24.3.5 Registers and Resets Sub-block Description

When the previous line is not to be used at the start of a band, the LBD ignores the previous line regardless of what the output of the SFU is, and acts as if receiving all zeros for the previous line.

Once started, the LBD begins requesting data from the DIU and commences decoding of the compressed data stream.

24.3.6 Stream Decoder Sub-block Description

A dataflow block diagram of the stream decoder is shown in Figure 114.

Figure 114. Stream decoder block diagram

24.3.6.1 DecodeC - Decode Command

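The DecodeC logic encodes the command from bits 6..0 of the bit stream to output one of three commands: SKIP, VERTICAL and RUNLENGTH. It also provides an output to indicate how many bits were consumed, which feeds back to the barrel shifter.

There is a fourth command, PASS_THROUGH, which is not encoded in bits 6..0; instead it is inferred from a special runlength. If the stream decoder detects a short runlength value, i.e. a number of 31 or less, encoded as a medium runlength, this tells the Stream Decoder that once the horizontal command containing this runlength is decoded completely the LBD enters PASS_THROUGH mode. Following the runlength there will be a number of bits that represent uncompressed data. The LBD will stay in PASS_THROUGH mode until all these bits have been decoded successfully. This will occur once a programmed number of bits is reached or the line ends, whichever comes first.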

24.3.6.2 DecodeD - Decode Delta 

The output delta is a 15 bit number, which is generally considered to be positive. Since it only needs to address the dots of a single line (out of a possible 32,768), a 2's complement representation of -3, -2, -1 will work correctly for the data pipeline that follows. This unit also outputs how many bits were consumed.

DecodeD needs to know the color of the run length to decode it correctly, as black and white runs are encoded differently. The stream decoder keeps track of the next color based on the current color and the current command.

24.3.6.3 State-machine

r::,^:^^^ ^ * ^ »^ ^-hine provides two SKIP instructions to 

passed. «.d the s^^^^tS'^e^Z^ ^^'fSlLCJS^Ttlf ' ^ ''^'^ " 
fetch ftom the command controller another Rnmr^r^-^^J^^ . instruction 

24.3.7 Command Controller Sub-block Description 

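The Command Controller interprets the command from the Stream Decoder. It provides the line fill unit with a limit address and color, and provides the next edge unit with a starting address to look for the next edge.

Figure 115. Command controller block diagram

24.3.7.1 State machine

The following is an explanation of all the states that the state machine utilizes.

i START

ii AWAIT_BUFFER

The command controller is able to leave this state when the state machine in the next edge unit has entered the NEU_RUNNING state. Once this occurs the command controller can proceed to the PARSE state.

iii PAUSE_CC

During the decode of a line the stream decoder can run short of data, and the SFU can also stall mid-line due to band processing latency. If either of these cases occurs the LBD needs to pause until the stream decoder gets more of the compressed data stream from DRAM or the SFU is ready again. The remaining states check whether the stream decoder has valid data and whether sfu_lbd_rdy is asserted, and hence whether the LBD needs to pause. PAUSE_CC is the state that the command controller enters to achieve this, and it does not leave this state until both are asserted and the LBD can recommence decompressing.

iv PARSE

When in this state the command controller can receive one of four valid commands:

a) Runlength or Horizontal

The delta value gives the number of bits of the current color to be added to the current line. If the run extends beyond the frame currently being processed by the line fill unit (only 16 bits at a time), it is necessary for the command controller to wait for the line fill unit to process up to that point. The command controller changes into the WAIT_FOR_RUNLENGTH state while this occurs. When the run reaches the end of the current line, the command controller signals this to the rest of the LBD and then returns to the START state.

b) Vertical

The command controller needs to find, in the previous line, the next change from the current color to the opposite color. The delta denotes where the changing element in the current line is relative to the changing element on the previous line for a Vertical command.

c) Skip

The stream decoder takes a Pass command and generates two skip commands that the command controller understands as two Vertical(0) commands, and it has been coded not to change the current color in this case.

d) Pass Through

When in pass through mode the stream decoder supplies one bit per clock cycle that is used to construct the current frame. Once pass through mode is completed, which is controlled in the stream decoder, the LBD can recommence normal decompression. The current color after pass through mode is the same color as the last bit in the un-compressed data stream. Pass through mode does not need an extra state in the command controller, as each pass through command received from the stream decoder can always be processed in one clock cycle.

v WAIT_FOR_RUNLENGTH

A RUNLENGTH can take the line fill unit longer than one clock cycle to write out. After the first clock cycle the command controller enters the WAIT_FOR_RUNLENGTH state until all the RUNLENGTH data has been consumed. Once finished, and provided it is not the end of the line, the command controller will return to the PARSE state.

vi WAIT_FOR_EDGE

After the first clock cycle of a Vertical command for which the edge has not yet been found, the command controller enters the WAIT_FOR_EDGE state and remains here until the edge in the previous line is found. Once finished, and provided it is not the end of the line, the command controller will return to the PARSE state.

vii FINISH_LINE

Figure 116. State diagram for the Command Controller (CC) state machine

24.3.8 Next Edge Unit Sub-block Description

The Command Controller supplies the Next Edge Unit with the start address from which it searches the previous line for the next edge.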
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Table 98. CDU registers 



0x14 



0x1 C 



MaxBJock 



BuflStartAdr 



15 



0x000 



0x0000 



BuffEndAcfr 



15 



NumBufflJnes 



0x24 



0x0000 



QxOC 



Number of JPEG MCU8|(6rJP^ 
I.e, 6x8 bytes) In a line - 1, ' 



Points to the start of the decompressed contone dr- 
cular buffer m DRAM, afigned to a half JPEG block 
boundary. 

A half JPEG Wock consists of 4 words of 256«bits 
encxiQh to hold 32 contone pixels in 4 cotors. i.e. half 
a JPEG block. 



Points to the start of the last half JPEG block at the 
«id of the decompressed conlone circular buffer in 

* half JPEG block boundary. 
A half JPEG block consists of 4 words of 256-bfts 

rjPEG ^ "^"^^ ^^^"^ ^ 



BypassJpg 



Defines size of buffer in DRAM In terms of the 
number of decompressed conlone Unes, The size of 
the buffer should be a multiple of 4 lines with a mini- 
mum size of 8 lines. 



0x0 



0x30 



NextBandCurr- 
SourceAdr 



17 



Determines whether or not the JPEG decoder will be
bypassed (and hence pixels are copied directly from
input to output).

0 - don't bypass, 1 - bypass
Should not be changed between bands.



0x34 



The 256-bit aligned word address containing the start
of the next band of compressed contone data in
DRAM.
This value is copied to CurrSourceAdr when both
DoneBand is 1 and NextBandEnable is 1, or when
Go transitions from 0 to 1.



0x38 



NextBandVafid* 
BytesLastFetch 



0x3C 



The 64-bit aligned word address containing the last
bytes of the next band of compressed contone data in
DRAM.
This value is copied to EndSourceAdr when
both DoneBand is 1 and NextBandEnable is 1, or
when Go transitions from 0 to 1.



Mask containing a 1 in each bit position that represents
a valid byte in the last 64-bit fetch of the next
band of compressed contone data from DRAM.
This value is copied to ValidBytesLastFetch when
both DoneBand is 1 and NextBandEnable is 1, or
when Go transitions from 0 to 1.



Read-only registers 



When NextBandEnable is 1 and DoneBand is 1, then
when cdu_finishedband is set at the end of a band:
- NextBandCurrSourceAdr is copied to CurrSourceAdr,
- NextBandValidBytesLastFetch is copied to ValidBytesLastFetch,
- DoneBand is cleared,
- NextBandEnable is cleared.

NextBandEnable is cleared when Go is asserted.
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TMe 98. CDU registers 




0x40 



0x44 



CurrSourceAdr 



0x48 



0x4C 



endSoufceAdr 



ValrdBytesLast- 
Fetch 



JPEG decoder core setup reglstens 



17 



19 



0x0^0000 



Specifies whether or not the current band has finished
loading into the local FIFO. It is cleared to 0
when Go transitions from 0 to 1.
When the last of the compressed contone data for the
band has been loaded into the local FIFO the
cdu_finishedband signal is given out and the
DoneBand flag is set.

If NextBandEnable is 1 at this time then CurrSourceAdr,
EndSourceAdr and ValidBytesLastFetch
are updated with the values for the next band and
DoneBand is cleared. Processing of the next band
starts immediately.

If NextBandEnable is 0 then the remainder of the
CDU continues decompressing the data
already loaded, while the read control unit waits for
NextBandEnable to be set before it restarts.



0x0.0000 



0x00 



The current 256-bit aligned word address within the
current band of compressed contone data in DRAM.



"'^"^^ "^"^ ^^^'^ containing the las? 



Mask containing a 1 in each bit position that represents
a valid byte in the last 64-bit fetch of the
band of compressed contone data from DRAM, i.e. if the
lower 3 bytes are valid, then the lower 3 bits of the mask are set.



1 OXM) 
1 0X54 


JpgDecMask 


5 


0x00 


^l^iSJ?"^ decoded they can also be output on " 
f^'V WP?^^'') port With the user sete^'^ 

SsixrtS^r ^"^^"^ 

4 SOF+SOS+DNL
3 COM+APP
2 DRI
1 DQT
0 DHT


1 0x58 


JpgOecTType 


1 


0x0 


Test type selector

0 - DCT coefficient displayed on JpgDecTdata
1 - QDCT coefficient displayed on JpgDecTdata


OxSC 


JpgDecTesten 


1 


0x0 


Signal which causes the memories to be bypassed
for test purposes.


^IPEQ decoder cc 


JpgOecPiype 
>re read-only status i 


4 

legtsters 


0x0 


^f?,^"^ parameters to be placed on port 
*/pffOecPl^/i/e (See Table 99), 


I 0x60 


JpgOecHdr 


8 


0x00 


i^wted header segments from the JPEG stream 



Ooc: SoPEC^hardware^design 
y/ersion: 2,3 



S3 Proprietary Document 



29 Nov 2002 
Page 274 



Table 98. COU registers 




0x68 



JpflOecPValue 




0x0000 



16 



0x0000 



13 - TSOS output of CS6150. Indicates the first output
byte of the first 8x8 block of the test data.
12 - TSOB output of CS6150. Indicates the first output
byte of each 8x8 block of test data.
11-0 - 11-bit output test data port - displays DCT
coefficients or quantized coefficients depending on
value of JpgDecTType.



Decoding parameter bus which enables various
parameters used by the core to be read. The data
available on the PValue port is for information only
and does not contain control signals for the decoder.



Bit 21 - jpg_core_stall (if set, indicates that the JPEG
core is stalled by gating of jclk as the output JPEG
half-block double-buffers of the CDU are full)

JPEG decoder core and is asserted when a pixel
is being output)
Bits 19-16 - fifo_contents (FIFO at input of JPEG
decoder core)

CS6150 (see Table 100 for descriptfon of bits) 



22.5.3 Typical operation 

The CDU should only be started after the CFU has been started 

/Lines. Users then set the Sui^Wt^^^^?^^"^^^ BuffEndBlockAdr inA NumBuf. 

for the band has finishcSSiL^^i^^fSS^^^^ 

indicating that the memory ^o^oty^^^t^f'^r. ^ CPU 

band of contone data. "^"^ ^ ^''^''^ now free. Processing can now start on the n«« 

i^SsSKSd'^^sr;^^ 

for restarting the CDU bet^en ^ ^ * ^ Ne^andEnabte. There are 4 mechanisms 

- Z).^Wbit. -n.e 

«jenextb»:2~Si:^i^^^^^ 

advance and store the band commands in DRAM relS' for ex^uSon """^ " 

d.This IS a combination of 6 and c above The Pn r /^n.f»,«. »i, *u ^r., t . 
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registers and sets the NextBandEmble bit befoie the end of the current h«„rf a» ^ 
current band the CDU sets nnn„R»«j ~ i "~ ^ °' current band. At the end of the 

22.5.4 Read control unit 

JS>rrr F^o''S'rJ^^Sd^"L*^^ ^ it to the iPEG 

receiving the data from tke DrnTIrT^.^rr.^L^^^^ - 256.bit accesses, 

accesses to DRAM is described ^0^9^00 it^K^ S" "^1^' 

by means of the state machine described faiK^ToL " implemented 

rordTaLTtS^^SSa,"';^^^^^ • ail counters and flags 

itwhethertoatterSt^^t^doftfr^SS^^^^^ 

does nothing. When DoneBandTc^^^t^uT ^ When A,neffW is set. the state machine 
up to 256-Kts at a time white Set^* ^ a« St LTfifo °M f T T '^^^ """^ "''^ 
knowledge about numbere of blocks o™^^ . . has no 

by conseLtive reads from DlJiS^c STl^Tr^no^^^^^ * " '^^'^'^ ^^^0 input HFO foil 
atleastatthepea^DRAMreadband^rfLT^S^^^^^^ 

diu_cdu^alid being as^ l! ^ ^ ^di^^^d by 

end_of_bandstore: « compared to both end_source_adr and 

* J^i^^^TlOoT^^jf^^^d^^*?^^^^ 

is set. nje remainii 64-bit values S W^t ff^^ ""/^^ 

theHFO. Ignored. i.c. they axe not written into 

' ^-^jxij^upisTto^sni^^^rrr ^^^"^ -o<.n._«^. then 

whether cw,; soun^^^ ^^^f^f^-^'l'^'^^^ ^^ouroe^adr + I. depending on 
FIFO is 0. ^ end_ofJ,andstore. The end_ofJuind control signal sent to the 

a<rr^tf/rB_«^ is output to the DIU as crfi/_rfm_rad>: 

A count is kept of the number of 64-bit values in the FIFO WI,«, , - . 

fl» FIFO. No« i, i, Z^^f b jJrScTp'lG S:*-'"-? "^-y «. "c*. 
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Beset OR pfRt 

cdu_diu_iTeq = 0 
ignore^data « 0 

1 

^ reset ^ 



lgnore.data « 0 



AND nj eot|nt 



cdu.diu^eq ^ 
ignofe.data « 0 



< 



idle 



> 



DonaBflnri = - 



cdu_d(u.rreq«0 
ignofe_data = 0 



req 



3 



odu_cfiu_rTeq « t 
ignore^data « 0 



ack 









cou_qiu_<rreq b o 




fgnore.data • 0 







jcurr sQiiffM grfrryf t^ffnntf ■ 

ey s^urr^ o;}f""" ^""""t- 



cdu_diu^rreq « 0 



^ read ^ — 



Figure 101. State machine to read compressed contone data
22.5.5 Compressed contone FIFO 

«n to the FIFO from the DIU u o/- iZi ««, i. "*lt is«Tit- 

^»««,^.»co..M.»«u,«j!;isrsro'o^*f^';r^^<^s 
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■ FIFO „ «Mi,i„^ dfect of teli^rh,^ comfmsed con»„e (tol. has be™ »>d ft,™ 
22.5.6 CS6150 JPEG decoder 

the CS6150 JPEG ^^^^ZZ^VS^'^oT"^:^^^^ '^"^ (An^hio^ve staS^i^t 
which a gated ve^icm of &e ^mTock i!flc G^. a"T f^^'^Sy)' "^^^ «>^^ clocked hyjclk 
JPEG decoder on a single coloVT^S h^X^ Z *1 l^"" 2™'^'*" * mechanism for stalling the 

the Pi^OutEnab input U, tSVEo1£5« H^ev^^ °' T^"* '^^^ » P-vidll 

block boundary and is insuffid<»tfar^Er " ^ ""^"^ ^"'« « JPEG 

instead tied^ >nsuil.cient for SoPEC. Thus gating of the clock is employed and PixOutEnab is 

s^^coirthi^eS^rti^!^^^ 

quantization tables, restart inlerval d«Ztion^d fe^.^^^ *^ *^ ""ff""'^ tabl^. 

the JPEGbytestic;^ automSyS^a^d?^ scan headers. The decoder parses and checks 
fying the JPEG segments ^^^^^^^Z^f^if^ "^^^ "-^^ After idend- 

as appropriate. Any errors detected in ttet^o^^ '° t "PP"P^,"^«s ^ be stored or processed 
sigr^edand.ifanerrorisfound.thei^r^ 

Lmes (DNL) marker at the end (normally nJc^si;^^ J^kS^»"^' ' '^^'^ ^'"""'^^ 

S:r£e?r ^« ^^agrams of the 

length as this is a modification to thc^oS " "^S*^ 64k lines 

Pixels in the c»^color:r'L^^rdL:ir«:4SS^^ 

The following subsections describe the means by which the CS6150 internals can be made visible.
22.5.6.1 JPEG decoder parameter bus 

mines which internal parameters are di^l^d o^ thf T^J^ t ""P"* C^/'gDec/'Tipe) deter- 

the PValue port does L cont^ ^nJoT^^'^'fyTS l"o" '^''^ 
Table 99. Parameter bus definitions



0x0 



0x1



0x2 



FY[15:0]



FX[15:0]



00_YMCU[13:0]



FY: number of lines in frame



FX: number of columns in frame



YMCU: number of MCUs in Y direction of the current scan
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Table 99. Parameter bus definitions 




0x4 



0x5 



0x6 



0x7 



0x8 



00_XMCU[13:0]



Cs0[7:0]_Tq0[1:0]_V0[2:0]_H0[2:0]



Cs1[7:0]_Tq1[1:0]_V1[2:0]_H1[2:0]



Cs2[7:0]_Tq2[1:0]_V2[2:0]_H2[2:0]



Cs3[7:0]_Tq3[1:0]_V3[2:0]_H3[2:0]



0x9 



CsH[15:0] 



OxA 



OxB 



CsV[15:0]



DRI[15:0]



000_HMAX[2:0]_VMAX[2:0]_MCUBLK[3:0]_NS[2:0]



XMCU: number of MCUs in X direction of the current scan



Cs0: identifier for the first scan component.
Tq0: quantization table identifier for the first scan component.
V0: vertical sampling factor for the first scan component.
H0: horizontal sampling factor for the first scan component.
Values = 1-4



Cs1, Tq1, V1 and H1 for the second scan component.
V1, H1 undefined if NS<2



Cs2, Tq2, V2 and H2 for the third scan component.
V2, H2 undefined if NS<3



Cs3, Tq3, V3 and H3 for the fourth scan component.
V3, H3 undefined if NS<4



CsH: no. of rows in current scan



CsV: no. of columns in current scan



DRI: restart interval



HMAX: maximal horizontal sampling factor in frame
VMAX: maximal vertical sampling factor in frame
MCUBLK: number of blocks per MCU of the current scan,
from 1 to 10
NS: number of scan components in current scan, 1-4



22.5.6.2 JPEG decoder status register

the JpgDecStatus register ThTcsfil ^ ^^^"^ "^'^^ ^^^^ «>y reading 
high to indicate an error condidon as defin"?^ iSle Tof ^ axe set to zero at reset and active 

SHB~1^— ^^^^ 

Zj ^J^' "I^^^ '^gcoder status r egister definitions 

^ 
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CtfEfTor 



HtError 



QtEnror 



OecError 



IDctfnProg 



OecinProg 



JpglnProg 



set When an fnvalid SOFparamerer'or ir^^^^^S^p^^^lStelS' 



Set when an invalid DHT segment is detected.



Set when an invalid DQT segment is detected.



Set when anything other than a JPEG marker is input.

Set when any of DecFlags(B:4} are set 



.^^'g^^^sM^taTa^'^'"'^ 



scan « complete. It Indicates that the cor e is in thel^^*^, 



' w „ , u »y uQwuing si aie. 



22.5.7 Half-block buffer interface

to stall the JPEG decoder core ^ S^^nt^^SEGul^^^ ' ™^ to be able 

pixel). We provide a mechanism for stXs JTec boundaiy. ,.e. after 32 pixels (8 bits per 

JPS.corejtaa is 1. TTie h^fZ^S^lt^^f. ^^^ ^ «Iock to the core when 

half JPEG blocks to dec^DirjPFO TLT^ ! i«sponsible for providing a set of double buffered 
DRAM (write control StTnLco^^"^^^^^ JPEO blocks to 

onlyasinglecolorpteTLSti^Seli^e",^^ 

'^^^l^%t^^^-,oT'^ ^^'^ '-'^-^"ock buffers and some simple 



lutir-btoGk buttor tntaif ace 



ptx_out_vaiJd - 



PQ_core.stall ^ 
jdk^enable ^ 



half'falock txiffer 
select unft 



contone 
plane 
buffer 



j£_advj)alf_bteck 
— rd.adv 



half.biock_ok_to_fead 



64 
-7^ 



cdu_dtu.dat&(63:0J 



Figure 102. Block diagram of half-block buffer interface
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22.S. 7. 1 Half "block buffer select unit 



single bit (^buff) for tJe c^^^^^^S .k^ "^"^ ""^^^ 

^«#.W/.._/«i57.Whenl,^c.;S T'"^ equals 

the production of pixels Tte cTocIc ^noZ « / the JPEG decoder core is gated off so as to stop 

outpm from thc Cm.V^^^cl!^^^^^^ 

(/c/*..«.W. is the inverse ofi;^:;^^^ ^ 7c«r equals pc/fc V^en Jclk.enable is 0, yc/A: is 0 

is incre- 

P^^our^-alU/^cd^th^^^^^ '^^ output nr^en equals 

ANDed with arfv of jpg^core_stalL The output rd_en equals hal/^block^ok to read 



22.5.7.2 Contone plane buffer

Each contone plane buffer consists of two half JPEG block buffers, as shown in block diagram form in Figure 103.



rtiLbuff.. 



wr_buff_ 



8 



fd ftn 



pteef_€iata ^ 



JPEG 
half-block buffer 0 



pixel data ^ 



JPEG 
fiaif-Wock buffer 1 




odu_dlu_data(63:0J 



contone plane buffer \ 



Figure 103. Contone plane buffer Interface 



tecKd M .h« fim .ha Shd !b«Si«t>;'^ «cond shift „g«„ is 4 X <4-bit D«a b col- 
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Write control unft 



22.5.8 



J3 



4rjne 



ORAM ¥ion$p 
DRAM word p«4 



DRAM 



4 line 

StOTB 



DRAM word 

ORAM word q 
ORAM word 044 



DRAM word q44n 



JPEG Wock 0 
fines 0 to 3 



JPEG block 1 
iinesOtoS 





±0 1 c 


to t c< 




cIli I c 


LI 1 C 


LI 1 a 


ILT 


C}L2 1 C; 


12 1 C 


12 1 a 




C3L3 t C* 


' — i — ^ 
tU 1 c 


^ — *— ^ 

13 1 CC 


^ — 



wordp 

WOfdfM^l 

wordp42 
word 



JPEG Wock n 
Ones 0 to 3 



JPEGbfockO 
lines 4 to 7 



i — 

JPEOtKocfcl 
Ones 4 to 7 











JPEG block n 
fines 4 to 7 




'"P5es4x64bjtwriiestDconsecuth« 

fjords In one ORAM row. kS^lhSte 
COU access to ORAM -*"'b« 

CX-CokirX 

LY - Une Y or 8 b/tes of a Bne In a JPEG bkx* 



Figure 104. DRAM storage arrangement for a single line of JPEG 8x8 blocks in 4 colors

ro„owsBeW..coLpo.S^--tJ-^^^^ 

block 0, color 0, line 0 in word p bits 63-0, line 1 in word p+1 bits 63-0,
                  line 2 in word p+2 bits 63-0, line 3 in word p+3 bits 63-0,

block 0, color 0, line 4 in word q bits 63-0, line 5 in word q+1 bits 63-0,
                  line 6 in word q+2 bits 63-0, line 7 in word q+3 bits 63-0,

block 0, color 1, line 0 in word p bits 127-64, line 1 in word p+1 bits 127-64,
                  line 2 in word p+2 bits 127-64, line 3 in word p+3 bits 127-64,

block 0, color 1, line 4 in word q bits 127-64, line 5 in word q+1 bits 127-64,
                  line 6 in word q+2 bits 127-64, line 7 in word q+3 bits 127-64,

repeat for block 0 color 2, block 0 color 3 ...

etc.
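
The storage arrangement above can be sketched as a small address calculation. This is only an illustration, not part of the CDU design: the function name and argument names are hypothetical, and it assumes (as Figure 104 indicates) that block n starts 4 words after block n-1, that word p holds lines 0 to 3 and word q holds lines 4 to 7, and that color c occupies bits 64c+63 down to 64c of each 256-bit word.

```python
def half_block_location(block, color, line, p, q):
    """Return (word_address, high_bit, low_bit) for one 8-byte line of a JPEG block.

    p: DRAM word holding line 0 of block 0 (lines 0-3 store).
    q: DRAM word holding line 4 of block 0 (lines 4-7 store).
    All names are illustrative assumptions, not register names.
    """
    if block < 0 or not 0 <= color <= 3 or not 0 <= line <= 7:
        raise ValueError("block, color or line out of range")
    base = p if line < 4 else q           # upper or lower 4 line store
    word = base + 4 * block + (line % 4)  # 4 words per half JPEG block
    low_bit = 64 * color                  # each color owns one 64-bit lane
    return word, low_bit + 63, low_bit


# Example: block 0, color 1, line 5 lands in word q+1, bits 127-64.
print(half_block_location(0, 1, 5, p=100, q=200))
```

Keeping the four colors of one line in a single 256-bit word is what lets a single DRAM access deliver 64 bits of each color.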



in 
IS as 
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oly « bits out rfU,. JSfrbil ».«,s » DRAM^X^^T""^ """"" I*™ CDU 
by U» DIU. llis mews dat the deoompreiTc^lr^ He »n.,m» bte of th. wnte a« ™ W 

block to DRAM. Once the half-block b^lff^TnltJ^f *° ^*^P» a half JPEG 

n«,ues.s a ^vrite access to DRAM by^sX c^^^^^? ' ^ -^"^ 

.ng to the first 64-bit value to be written^T^f ^^^ro^J aT^' "^^^ ''o-^^Pond- 

access of 4x64 bits is issued by the CDU The Dru7, . "^^it value in each 

fourth 64.bit values). The state macl^eAe^aSyo ^etTr^' 'f^^'l' '^'^ ^ 

Mg a read of 4 64-bit values frx.m the hal^WoTb^erl^SS^!^ adaiowledge ftom the DIU before imtiat- 
put cdu_diu.wvalid is asserted in the cycle afS rf^^^^T ^ "^f^^^ out- 
thc cdu^diu_da:a bus and should Z *° « P'^sent on 

is then sent to the half-block bu^l^tS t .kI^T ^'^^ ^ 

s^.d„^bea^..,tobe^ttento:S!rr^^^^^^ 

i?dS:r^^d^:'°cre^-^^t^^^^ 

cleared and twrjtaljblock «6- gSToadS^^S^ ""^ from 0 to 1 all counter, and flags shouldt^ 
buff^tart_adr*nuLj,lo^lf^ *«#_^terf_Wr and upr_halfblock_adr gettloaded wi& 

// assian write address output to DRAM 

cdu_<uu_w.art4..3, . color " i^^^h^'::,^:,:!-'^' 

if (half == 1) then 

ir^Xadii^Sa^^rr^rtne'n*^ -^-^ -cces. 
If (half 1) then 
half s 0 

if (color == maxLplane) then 
color 0 

if (block max^bloelc) then /✓ • • 

pulse wradvSline wrxtxng a line of jpeg blocks 

block = 0 

^^^-^^^'^^lock adr « buff m--*-*. --a 
^^^^"■^-halfbloclc^dr = buf f_st;rt!«dr »'"«-end_«dr) then 
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else " UPr_halfbloc>^«dr * ™«^block ♦ 2 

block ++ 

else ' address for lines 4-7 for next block 

color 4"*- 

else 

half 1 

if (color s= inax_plane) then 

if (blook n^^bioc^c, then // end of writing - ai„e of JPEG bloc., 

else 

lwr_hal£block.adr = lwrJ«ilfblocK_«ir * ma^bloek ♦ 2 



else 

lvnr_halfblock_adr 



// move to address for lines 0-3 for next block 
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ResatQR prst n«=Q 
cdu_dru__wreq = 0 
odu_cfiu_wvaijd s o 
rd^adv « 0 

T d,adv,hatf_bloc k«o 

reset ^ 



Go cap 

oduLdltj.%wreq a 0 
odu_d(u.wvaDd s o 
fcf.adv«0 
fd_adv_half_btock a 0 



c 



odu_diu_wreq o 0 
odu.diu.wvaJid e o 
nl_adv B 0 

rcJ_acfv_half block -i 0 



req 



> 







naiT DH?CH ok m read^t 
cclu_dlu_wreq s I 
odu__dtu.wvaIid s 0 
rd_advBO 
^fdjadv^haHLWockoO 


C ack ) 




dm_cdu wacfc«, 1 


cdu_diij_«vvalld s o 
rd_adva« i 

fd.adv_lia<LblockaO 


read ^ 




cdu_dlu.wreq«0 

OdU_(fiU_WVB0d e 1 
rd_adv o i 

rd_adv_halLblock » 0 


writel ^ 



c 



cctu_dJu_WTeq is 0 
cdu_diu.%WBfid B 1 
id_adv « 1 

'd,adv_half.hiock « 0 



cdu_diu_wroG » 0 
odu_dtu_vwarid « o 
*d_adv » 0 

rd_adv_half_block»0 



write2 



c 



cdu_dlu_wreq c o 
cdu^dfu.vwaOd » 1 
rd_adv s t 

»d_advj»anLblock«1 



writes 



c 



3 



cdu_d}u_wreq c o 
cdu.d]u.wvai)d « 1 
rdLadv « o 

«l-adv_halLblock«0 



writc4 



> 
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22.5.9 Contone line store interface 

write to. Thus the size of the line storein DR^Jf^^^.S^ -T^ ^^^"^ '^'"^ CDU to 

Une ston, interface is 8 line^.^oS« a^^hSt^ ^ "t^^ ' f ^ "^^ 

schen^e while 16 lines pro.i^ l^^l^^tsl'^Z ^ ^"^^ b'^" 

s^^t D^"r::et:ir^ero£r«e^^^^^^^ -^-^^ 

set to the value of num buff T?ie rmj transitions from 0 to 1. ««m_/,«e5_av«/ is 

available for 8 lines. inS^T.!ZhI^t^l'°^^°^^^ »° °RAM as long as there is space 

writing 8 lines, the M^ite conS>\^tse^Z J^^^^- -T "'^ " ^'^^ ^ fini^ed 

/toe^tor»_oJLto Mrtetobesetagain^r?mic ^ J^r "^^^ ^''i's for 

Priately. ands^ni its o^^^ r^SSi to^e cS3? "'l' '''[.'^"^'''^ '° *m,^*//«e pulses appro- 
It finishes ^ U«n.. ^^^.^^^^SsttSe"^^ 






23 Contone FIFO Unit (CFU) 



23.1 Overview 



^^^mJI^S^ ^'SZ^S^^^^^ r-P--'* -tone data layer .otn the 

color im^ion m up to 4'^.olo^^°''^r^Tnfi:^r^ """^f" '° ''^^ ''^ 0P«°"^ 

fonned in the horizontal and veSical iS^o.^^y't^Cm S dT* "'if''"- "''^'"^ P"" 
pnnter resolution. Non-integer scaling is ^^Li^i^^ ^'^ *° "^^ matches the 

23.2 Bandwidth requirements

The CFU must read the contone data from DRAM fast enough to match the rate at which the contone data is consumed by the HCU.

Pixels of contone data are replicated an X scale factor (SF) number of times in the X direction and a Y scale factor (SF) number of times in the Y direction so that the output matches the printer resolution. Replication in the X direction is performed at the output of the CFU. Replication in the Y direction is performed by the CFU reading each line a number of times, according to the Y-scale factor, from DRAM. The HCU generates 1 dot per system clock cycle to achieve a print speed of 1 side per 2 seconds for full bleed A4/Letter. The CFU output buffer needs to be supplied with a 4 color contone pixel (32 bits) every SF cycles. With a scale factor of 6 the CFU must read data from DRAM at 5.33 bits/cycle (see footnote 1).
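
The footnoted figure follows from simple arithmetic, sketched below. The function name is hypothetical, and the default values (32 bits per 4 color contone pixel, scale factor 6) are taken from the footnote rather than from any register.

```python
def cfu_read_bandwidth(bits_per_pixel=32, scale_factor=6):
    """Average DRAM read rate in bits per cycle needed by the CFU.

    One contone pixel must be supplied every `scale_factor` cycles,
    because each pixel is replicated that many times in X, and each
    line is re-read from DRAM for every replication in Y.
    """
    if scale_factor <= 0:
        raise ValueError("scale factor must be positive")
    return bits_per_pixel / scale_factor


print(round(cfu_read_bandwidth(), 2))
```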

23.3 Color space conversion 

bTeS^otXTr/^E"?; " ^J^t^l^^ r •^'^ -^^^ represented 

and K di-ctly represented bfciS^ 5^ 2^^^^^ the four color, may brc. M. V. 

muIti-SoPEC printing with exact cololi * ^ "P"^* ^old. metaUic green etc. for 

ci^cl^JSS^'^'^S^°:'X " ^f'^^^ -hen luminance and chromina™:e 
luminance infoLti^andt .SneedTo be3„^^^^ be luminance, but C. M and Y each contata 
foreprovide the means by ^^cH^C^ ^''^'^'^,^^^^ tables We there- 

sion. w dorti^ as YCrCb. K does not need color conver- 

ZpritlAt'dtrSS^S^^^^^^ 

to CMY ^ • ^ ^^"^ chtain^ then color converted to RGB. and finally back 

The external RIP provides conversion from Ron tn vrw-n. 

implementation of the inverse transfo^^^lfsfpEC^^^ "^^^ "^^^^ 

are nomiaJized to occupy all 256 levels of an 8-bit biSy^i^<S ^ 

The CFU provides the translation to either RGB or CMY Ran i= a ■ 



1. 32 bits / 6 cycles = 5.33 bits/cycle
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IS no 



• 1 color plane, no color space conversion

• 2 color planes, no color space conversion

• 3 color planes, no color space conversion

• 3 color planes YCrCb, conversion to RGB

• 4 color planes, no color space conversion

• 4 color planes YCrCbX, conversion of YCrCb to RGB, no color conversion of X

23.4 Color space inversion 

Su^ifrci^^sf^rp^^^^^ 

-ybeusedtoprovideplan^c^SLtS,^^^^^^^ 
The RGB to CMY conversion is given by the relationship: 

• C = 255 - R

• M = 255 - G

• Y = 255 - B

I^Ti'iS!?^'''' require the page RIP to calculate the RGB from CMY as follows: 



R = 255 - C
G = 255 - M
B = 255 - Y
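
As a minimal sketch of the inversion above, assuming 8-bit planes and a per-plane enable mask in which bit n selects inversion of color plane n (the function and parameter names are hypothetical):

```python
def invert_planes(pixel, invert_mask):
    """Invert selected 8-bit color planes of one contone pixel.

    pixel: sequence of up to 4 plane values, each 0..255.
    invert_mask: bit n set means plane n is replaced by 255 - value.
    """
    if any(not 0 <= v <= 255 for v in pixel):
        raise ValueError("plane values must be 8-bit")
    return tuple(
        255 - v if (invert_mask >> n) & 1 else v
        for n, v in enumerate(pixel)
    )


# Inverting planes 0-2 maps RGB to CMY and leaves a fourth plane untouched.
print(invert_planes((0, 128, 255, 10), 0b0111))
```

Applying the same call twice with the same mask returns the original pixel, which is why one operation serves both the RGB to CMY and the CMY to RGB direction.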



23.5 Scaling 



seottd b, . nuncMor^nd a fc^S^l^™ ^ ""^ "f;""?" sealing witt ite scale (Iraor Spre- 
shouM be grc,^ am o?t,l?MTXo2ii^ «P of <be pixel <taa b allowed, i.e. ae ou^Xor 
*e „r i. ^.^.^Z'^^'^n:^ ^Srj^ 2 b, a .acor of »„ .M a ban 



? IS gener- 



if (count + denominator - numerator >= 0) then
    count = count + denominator - numerator
    advance = 1
else
    count = count + denominator
    advance = 0
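
The counter update above can be modelled in a few lines of Python to see how it replicates pixels. This is a sketch under stated assumptions: the comparison is taken to be "greater than or equal to zero", one output pixel is emitted per evaluation of the counter, and the function and parameter names are hypothetical.

```python
def replicate_line(pixels, numerator, denominator, start_count=0):
    """Scale up one line of pixels by numerator/denominator (>= 1).

    Each loop iteration emits the current pixel once, then updates the
    counter; the input position advances only when the counter test passes.
    """
    if numerator < denominator or denominator <= 0:
        raise ValueError("only scaling up is supported")
    out = []
    count = start_count
    index = 0
    while index < len(pixels):
        out.append(pixels[index])
        if count + denominator - numerator >= 0:
            count = count + denominator - numerator
            index += 1  # advance = 1: move to the next input pixel
        else:
            count = count + denominator  # advance = 0: repeat this pixel
    return out


print(replicate_line([1, 2], 6, 1))   # integer scale factor of 6
print(replicate_line("abcd", 3, 2))   # non-integer scale factor of 1.5
```

With a scale factor of 3/2 the repeats alternate between two and one per input pixel, so the average replication matches the requested ratio without any division.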
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23.6 Leao-in and lead-out cupping 



SI 



block n below) will be the last JPEot^nl^u^^^^^Tu 7 of «he 2 SoPECs (JPEG 

line printed by SoPEC #2. Seisin t5rjS.EG bli^^ TfJ^ ^f^^ #1 and the fim JPEG block in the 
ately setting L LeadOuJ^Z^pZl^^^sJ^^^^^ ^PP^P- 
at the beginning of each line Tlie n^^^^, ^^^^^C *2 must be ignored 

Leadlncl^^ legiSer "^"""^ '^^''"^ of line is specified^y the 

It may also be the case that the CDU writ^^ m it t«^«. ms^ 1 1 » . 

as shown for SoPEC #2 bel^.SSs^ Ae ^S^tS/"^ " ^^^^J" C"^' 
spond to JPEG block « but the valJ^bTAnS^l^a.^^^n^- 

block m-I. nius JPEG block « is n^^ad L bySSj;^ the CPU « set to correspond to JPEG 



SoPEC #1 
lea(^ln aiea 
I 



SoPEC #2 SoPEC #1 
teacJ-m area ^ lead-out area 



SoPEC #2 
iead-out area 




SoPEC #1 prints left 
side of page 



SoPEC #2 prints right 
side of page 



/^/.grA register defines the size of the tara« ZlTr^!,u^ setttng Q^c XstartCount register. The Hctd,ine- 
m,ls the sLing of the ^^dfiS SK„f°i t HC^ ""^'^ '"'''""^ 




23.7 Implementation

Figure 107 shows a block diagram of the CFU. 



Contone 
Decoder Unit 




Figure 107. Block diagram of CFU 
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23.7,1 Definitions of I/O 

Table 101. CFU port list and description





Mm 

ss 


MM. 
hS 




PCU interface 

1 pcu_cfu_sel 

1 pcu_rwn 


1 

1 

1 


1 

In 
In 


1 System reset, synchronous active low 
Block sele« from the PCU. When pcu_cfu_selia high both 


j pcu.aclf(6:2J 

1 pcu_dataoulf3l.-0] 


4 

32 


In 
In 


Common read/not-write signal from the PCU.

PCU address bus. Only 5 bits are required to decode the
address space for this block.


1 cfii_j)cu_rdy 

j cfu_pcu_<lata(3t:0I 


1 

32 


Out 
Out 


Shared write data bus from the PCU 

mr>^.*°"f 2!" «ftUW/_/»y 18 high it Indicates 

POLjtataomhas been regislered by the blOe* and Ibra read 
cyae tnia means the data on thi nrai Hafe» m> k^kn^ 

Read data txje to the PCU 


1 DiU Interface 
j cfu_dlu_rreq 


1 


Out 


panled by a valid read address. 


1 dfULcfu_rack 

1 cfu_diu_radf(21:5) 


1 

17 


In 

Out 


Acknowledge from DIU, active high. Indicates that a read
request has been accepted and the new read address can be
placed on the address bus, cfu_diu_radr.


j dju_cfu_rvalld " 
1 diu_data[63.-0] 


1 

64 


In 
In 


CFU read address. 17 bits wide (256-bit aligned word).
Read data valid, active high. Indicates that valid read data is
now on the read data bus, diu_data.
Read data from DRAM.


1 CDUfntefface 
1 cdu_cAi_wradv80ne 


1 


In 


Write 8 line pulse, active high. Indicates that the CDU has finished
writing 8 lines of decompressed contone data to the circular
buffer in DRAM and the data is available to be read by the
CFU.

1 cfu_cdu^idadvtine 
HCU interface 


1 


Out 


Read line pulse, active high. Indicates that the CFU has finished
reading a line of decompressed contone data from the circular
buffer in DRAM and that line of the buffer is now free.


hcu_cftj_advdot 

cfu_hcu_avan 
cfu_hcu_o0data[7:0J 
cfu_hcu_c1 data(7,-0J 
cfu_hcu_c2dalaf7:0] i 
cfu_hcu_c3datar7:0] t 


1 

1 

B 
3 

3 1 
3 < 


In 

Out 
Out 
Out 
Dut 
3ut 


Informs the CFU that the HCU has captured the pixel clala on 
ptxei on the data lines. 

Indicates valid data present on cfu_hcu_c[0-3]data lines.
Pixel of data in contone plane 0.
Pixel of data in contone plane 1.
Pixel of data in contone plane 2.
Pixel of data in contone plane 3.
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23.7.2 Configuration registers 

f"?° *^ CFU are progiammed via the PCU interface. Refer to section 21 8 2 o„ 

oitcs; ot cju^cu_data. The coufiguration registers of the CFU are listed in Table 102: ' 
Table 102. CFU registers 




Control reg isters 
0x00 



Setup registers 



Reset 
Go 



0x1 



0x0 



A write to this register causes a reset of the CFU.



Writing 1 to this register starts the CFU. Writing 0 to this
register halts the CFU.

When Go is deasserted the state-machines go to their
idle states but all counters and configuration registers
keep their values.

When Go is asserted all counters are reset, but configuration
registers keep their values (i.e. they don't get
reset).

The CFU must be started before the CDU is started.
This register can be read to determine if the CFU is running
(1 - running, 0 - stopped).



0x10 


MaxBlock 


13 


0x000 


Number of JPEG MCUs (or JPEG block equivalents, i.e.
8x8 bytes) in a line - 1.


0x14 


Buf^tartAdr 


15 


0x0000 


Points to the start of the decompressed contone circular
buffer in DRAM, aligned to a half JPEG block boundary.
A half JPEG block consists of 4 words of 256 bits,
enough to hold 32 contone pixels in 4 colors, i.e. half a
JPEG block.


0x18 


BuffEndAdr 


15 


0x0000 


Points to the end of the decompressed contone circular
buffer in DRAM, aligned to a half JPEG block boundary
(address is inclusive).

A half JPEG block consists of 4 words of 256 bits,
enough to hold 32 contone pixels in 4 colors, i.e. half a
JPEG block.


OxIC 


4IJneOffset 


13 


0x0000 


Defines the offset between the start of one 4 line store and
the start of the next 4 line store. In Figure 108 on
page 294, if BuffStartAdr corresponds to line 0 block 0
then BuffStartAdr + 4LineOffset corresponds to line 4
block 0.

This register is required in addition to MaxBlock as the
number of JPEG blocks in a line required by the CFU
may be different from the number of JPEG blocks in a
line written by the CDU.


0x20 


YCrCb2RGB 


1 


0x0 


Set this bit to enable conversion from YCrCb to RGB.
Should not be changed between bands.
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Table 102. CFU registers 



0x28 



0x2C 



0x30 



0x34 



0x40 



0x44 



HcuUneLength 



LeadlnCfipNum 



LeadOutCHpNum 



XstartCount 



0x0 



16 



0x0000 



0x0 



0x00 



XscateNum 



XscaleDenom 



YscaleNum 



0x01 



0x01 



0x01 



bit0 - 1 invert color plane 0
     - 0 do not convert
bit1 - 1 invert color plane 1
     - 0 do not convert
bit2 - 1 invert color plane 2
     - 0 do not convert
bit3 - 1 invert color plane 3
     - 0 do not convert
Should not be changed between bands.
Number of contone pixels - 1 in a line (after scaling).
Equals the number of hcu_cfu_advdot pulses - 1
received from the HCU for each line of contone data.



Number of contone pixels to be ignored at the start of a
line (from JPEG block 0 in a line). They are not passed to
the output buffer to be scaled in the X direction.



Number of contone pixels to be ignored at the end of a
line (from JPEG block MaxBlock in a line). They are not
passed to the output buffer to be scaled in the X direction.

Value to be loaded at the start of every line into the counter
used for scaling in the X direction. Used to control the
scaling of the first pixel in a line to be sent to the HCU.
This value will typically be zero, except in the case where
a number of dots are clipped on the lead in to a line.



Denominator of contone scale factor in X direction. 



Numerator of contone scale factor in Y direction. 



23.7.3 



Storage of decompressed contone data in ORAM 

The CFU reads decompressed contone data from DRAM in single 256-bit accesses. JPEG blocks of decompressed contone data are stored in DRAM with the memory arrangement shown in Figure 108. The CFU reads 64 bits in 4 colors from a single line in each 256-bit DRAM access.
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4 line 
fiiore 



ORAM vwortf p 
DRAM word fy^ 



DRAM 



JPEG block 0 
Ifnes 0 to 3 



JPEG Wock 1 
Jines 0 to 3 



DRAM word p44n 
— ORAM word q 
DRAM wort 



255 1Q1 127 a.'X n 
c ay I C2L0 ! CILO i Cq lo} v«rcfp 

wordp>t 



C3^^ I C2L1 i CILI I qOL-i 



i C2L2 t C1 L2 ; Cn LP | v«fd i>*2 



JPEG bTocfc n 
Iln«3 0to3 



JPEQ brock 0 
Qn«s4to7 



JPEG block 1 
Iin8S4to7 



ORAM word q+4n 



JPEObCockn 
lines4to7 



255 



191 



cq^4 r 


C2L4 1 


C1U 




C3^ 


C2LS 1 


C1L5 


1 COLS 


C3^e 1 


C2L8 i 


C1L6 


1 C0L6 


C3^7 1 


C2L7 1 


C1L7 


f C0L7 



wDfdq 
word q+1 
word q-f2 
word q-fd 



(mpBes one 256 bit read of a word in DRAM 



CX - Cotof X 

ly - Une Y or 8 bytes of a line in a JPEG bk>ck 



Figure 108. DRAM storage arrangement for a single line of JPEG blocks in 4 colors 

The CFU reads data a line at a time in 4 colors from DRAM. The read sequence, as shown in Figure 108, is as follows:

line 0, block 0 in word p of DRAM
line 0, block 1 in word p+4 of DRAM
...
line 0, block n in word p+4n of DRAM
(repeat to read line a number of times according to scale factor)
line 1, block 0 in word p+1 of DRAM
line 1, block 1 in word p+5 of DRAM
etc.
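
The read order can be sketched as a generator of DRAM word addresses. This is an illustration only: it assumes an integer Y scale factor (the hardware also supports non-integer ratios through the scaling counter), covers a single 4 line store starting at word p, and uses hypothetical names throughout.

```python
def cfu_read_words(p, num_blocks, repeats_per_line, lines=4):
    """Yield DRAM word addresses in the order the CFU reads one 4 line store.

    p: word address of line 0, block 0.
    num_blocks: number of JPEG blocks in a line.
    repeats_per_line: how many times each line is read (integer Y scale).
    """
    for line in range(lines):
        for _ in range(repeats_per_line):
            for block in range(num_blocks):
                # block n starts 4 words after block n-1; line selects the word
                yield p + 4 * block + line


# Two blocks, each line read twice, first two lines only.
print(list(cfu_read_words(0, 2, 2, lines=2)))
```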



23.7.4 



Decompressed contone buffer 

On the DRAM side, wr_buff' indicates the current buffer within each rfoiihu K„m.^ •* 

to. H._.e/select3 which double-buffer to write the 64 bl^S^T^'^^^Jt^^fTs^eS" '^'^'^ 
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23.7.5 Y-scaiing control unit 



DRAM in single 256-bit accesses. Ae 2 fr^^h J^fn r TT*'' ""^'""^ ^ " 

The pn,tocol and timing for read ^^^ T^T? ^J"""' ' '^^'^'^^ ^^'^^ 'O'cle). 

^esses.DRAMar.i™p,e™ent^,rj^.^r:t^^^^^^^ '^eai 

&#_oJLto_Mr//e flags to teU t^e5,t t^^^«™^ '^^^ /«e*_oiL/o readZ 

When W_.*_r._W is O^l^e „iS ^ 

machinecontinuestoloaddataintotSd^Z^^^^ nothing. When line8.ok_to_read is 1 the state 
space available in the buffen 

(t/^isf£t':t:stxt't5i^^^^ 

that writes are to occur to. ^ ^ »«> occur fiom. and a single bit (y^_buffi for the cunent buffer 

*KfiLo*_to_w«te equals ~i«j^ava///Wr 6«<57 , . 

IS set. and H^_6u#is inverted VWieneve7^f\:A ^-ft^^ " f *H«:«va,7/Wr dwW 

of da«. ftom DRAM to the bufS^t^SiJll '^^^^ ^ ^ -^te ti 6i4>S 

SKSbM'ST;^;^^^^^ 

«/.,e„ and n^e/ gets incremented To foinZVn^ J^^ L^'fJ^ '^^'^ ^ 

vmtc the data to the output double-bSffer of Z Sj^en^ll^ u ^'''^ *° 

bill ^d^'.-is asserted. *,^_ava///.<._/^^is^S^^ ^^^-^ equals 

o^i^'X^eT^^^r-Je'^J^X^^^ CPUmoves 
direction is thus perfoimed. '^'^^ contone data. Scalmg to the printhead resolution in the Y 

// assign read address output to DRAM
cfu_diu_radr[21:7] = curr_halfblock
cfu_diu_radr[6:5] = line[1:0]

// update counters and addresses after each DRAM read access
if (block == max_block) then // end of reading a line of contone in up to 4 colors
    block = 0
    // check whether to advance to next line of contone data in DRAM
    if (y_scale_count + y_scale_denom - y_scale_num >= 0) then
        y_scale_count = y_scale_count + y_scale_denom - y_scale_num
        pulse RdAdvline
        if (line == 3) then
            line = 0
            // update half block address for start of next line, taking account of
            // address wrapping in the circular buffer and the 4 line offset
            if (curr_halfblock == buff_end_adr) then
                curr_halfblock = buff_start_adr
                line_start_adr = buff_start_adr
            elsif ((line_start_adr + 4line_offset) == buff_end_adr) then
                curr_halfblock = buff_start_adr
                line_start_adr = buff_start_adr
            else
                curr_halfblock = line_start_adr + 4line_offset
                line_start_adr = line_start_adr + 4line_offset
        else
            line ++
            curr_halfblock = line_start_adr
    else
        // re-read current line from DRAM
        y_scale_count = y_scale_count + y_scale_denom
        curr_halfblock = line_start_adr
else
    block ++
    curr_halfblock ++
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cfii_cliu_freq = o 
Figure 109. State machine to read decompressed contone data from DRAM (states: idle, req, ack, read1, read2, read3, read4)

23.7.6 Contone line store interface



The contone line store interface provides [...] line-at-a-time. The CFU may only begin to read from DRAM when the CDU has written 8 complete lines of contone data. When the CDU has finished writing 8 lines, it sends a [...] signal to the CFU, and the count of lines available in the buffer is incremented by 8. The CFU may continue reading from DRAM as long as [...]. When it has finished reading a line of contone data from DRAM, the Y-scaling control unit sends a RdAdvline signal to the contone line store interface and to the CDU to free up the line in the buffer in DRAM. The count of lines available is decremented by 1 on receiving a RdAdvline pulse.

23.7.7 Color Space Converter (CSC) 

»c«.d s»8^ n,. -a, c^lor pu™, if^^t b»S^^?^ •^'^ "J *e i«P" P«ds passed to tte 
latency of lhec»™„ YQCb to RObCL^^ST w "» ""^B Nol. tte 

s:„:rx:r'^:'SrefvS^:^ ^ .r 
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Figure 1 10 shows a block diagram of the colorspace converter. 
Figure 110. Block diagram of color space converter

[...] accuracy is maintained with 15 bits. The conversion is implemented as follows:

• R* = Y + (359/256)(Cr - 128)

• G* = Y - (183/256)(Cr - 128) - (88/256)(Cb - 128)

• B* = Y + (454/256)(Cb - 128)
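
The conversion can be sketched in integer arithmetic as follows. This is an illustration only: the function name is hypothetical, and the rounding mode (halves round up) and the upper saturation limit of 255 are assumptions, chosen to agree with the notes on saturation and rounding that accompany this section.

```python
def ycrcb_to_rgb(y: int, cr: int, cb: int) -> tuple[int, int, int]:
    """Convert one 8-bit YCrCb pixel to 8-bit RGB using the /256 coefficients."""

    def scale_round_clamp(acc: int) -> int:
        # acc holds the result scaled by 256. Adding 128 before the
        # arithmetic shift rounds to nearest (halves round up); the
        # result is then saturated to the range 0..255 (assumed).
        return max(0, min(255, (acc + 128) >> 8))

    r = scale_round_clamp(256 * y + 359 * (cr - 128))
    g = scale_round_clamp(256 * y - 183 * (cr - 128) - 88 * (cb - 128))
    b = scale_round_clamp(256 * y + 454 * (cb - 128))
    return r, g, b
```

For Y = Cr = Cb = 0 the unclamped values are -179, 135.5 and -227, so the sketch returns (0, 136, 0).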

23.7.8 X-scaling control unit

The CFU has a [...] double-buffer at its output between the color space converter and the HCU. The X-scaling control unit [...] provides the mechanism for keeping track of the current read and write buffers, and ensures that a buffer cannot be read from until it has been written to.

[...] a single bit (wr_buff) for the current buffer that writes are to occur to.

[...] wr_adv is 1. Pixels in the lead-in and lead-out areas are [...]



1. -179 is saturated to 0

2. 135.5 with rounding becomes 136.

3. -227 is saturated to 0

if (wr_adv == 1) then
   if (pixel_count == {max_block, b111}) then
      pixel_count = 0
   else
      pixel_count ++
   if ((pixel_count < leadin_clip_num)
         OR (pixel_count > ({max_block, b111} - leadout_clip_num))) then
      wr_en = 0
   else
      wr_en = 1

[...] wr_buff is inverted.

The output cfu_hcu_avail equals buff_avail[rd_buff]. [...] indicates to the HCU that data is available to be read from the CFU. [...]

The algorithm for non-integer scaling is described in the pseudocode below. x_scale_count should be loaded with x_start_count after reset and at the end of each line. x_start_count controls the amount by which the first pixel is scaled by. hcu_line_length and [...] control the amount by which the last pixel in a line that is sent to the HCU is scaled by.

if (hcu_cfu_advdot == 1) then
   if (x_scale_count + x_scale_denom - x_scale_num >= 0) then
      x_scale_count = x_scale_count + x_scale_denom - x_scale_num
      rd_en = 1
   else
      x_scale_count = x_scale_count + x_scale_denom
      rd_en = 0
else
   x_scale_count = x_scale_count
   rd_en = 0

[...] hcu_cfu_advdot pulse is received, then a rd_en pulse is generated to present the [...] is reset to 0 and x_scale_count is loaded with [...]



Doc: SoPEC_hardware_design
Version: 2.3
Proprietary Document
29 Nov 2002
Page 300

SoPEC : Hardware Design

24 Lossless Bi-level Decoder (LBD) 



24.1 Overview 



[...] DRAM access is available. A pass-through mode is provided for [...] 50:1. Lossless bi-level compression across an average page is about 20:1, with 10:1 possible for pages which compress poorly.

The output of the LBD [...] to the SFU (Spot FIFO Unit) [...] (Halftoner/Compositor unit) for the next stage in the printing pipeline. The LBD also outputs a lbd_finishedband control flag that is used by the PCU and is available as an interrupt to the CPU.



Figure 111. High level block diagram of LBD in context

24.2 Main Features of LBD

Figure 112 shows a schematic outline of the LBD and SFU.

[...] throughput capability is retained for SoPEC [...]. The PEC1 LBD outputs 16 bits in parallel [...] SoPEC. Therefore the LBD in SoPEC can run much faster than is required. This allows stalls, e.g. due to band processing latency, to be absorbed.

[...] The special run-length code is always executed as a run-length code, followed by pass through.

[...] up to a programmable number of lines [...] to DRAM. Therefore the LBD must [...]

A signal sfu_lbd_rdy indicates that both the SFU's NextLineFIFO and PrevLineFIFO are available for writing and reading, respectively.

[...] An A3 line of 19488 dots requires 2.4 Kbytes of storage.

The LBD finished band signal is exported to the PCU and is additionally available to the CPU as an interrupt.
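
The storage figure quoted above for a line of bi-level dots follows from one bit per dot. A minimal sketch of the arithmetic is given below; the function name is illustrative and is not part of the design.

```python
def line_storage_bytes(dots_per_line: int, bits_per_dot: int = 1) -> int:
    """Bytes needed to hold one decompressed bi-level line."""
    total_bits = dots_per_line * bits_per_dot
    return (total_bits + 7) // 8  # round up to whole bytes
```

An A3 line of 19488 dots gives 2436 bytes, i.e. about 2.4 Kbytes.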



Figure 112. Schematic outline of the LBD and the SFU (all FIFOs are 64 bytes, twice the DRAM data word width)

24.2.1 Bi-level Decoding in the LBD

[...] simplified run length encodings. The encodings are listed in [...]

[...] pass through mode is activated [...] a pre-programmed number of bits, whichever is shorter. The special run-length code is always executed as a run-length code, followed by pass through. The pass through [...] medium length run-length with a run of less than or equal to 31.

Table 104. Run length (RL) encodings

Name                                             Encoding
Short Black Runlength (5 bits)                   RRRRR1
Short White Runlength (5 bits)                   RRRRR1
Medium Black Runlength (10 bits)                 RRRRRRRRRR10
Medium White Runlength (8 bits)                  RRRRRRRR10
Medium Black Runlength with RRRRRRRRRR <= 31,    RRRRRRRRRR10
   Enter pass through
Medium White Runlength with RRRRRRRR <= 31,      RRRRRRRR10
   Enter pass through
Long Black Runlength (15 bits)                   RRRRRRRRRRRRRRR00
Long White Runlength (15 bits)                   RRRRRRRRRRRRRRR00



Since the compression is a bitstream, the encodings are read right (least significant bit) to left (most significant bit). The run lengths given as RRRRR in Table 104 are read in the same way (least significant bit at the right to most significant bit at the left).
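
The reading order can be illustrated with a small decoder for the run-length field alone. This is a sketch under stated assumptions, not the LBD implementation: it assumes the field widths of 5/10/15 bits for black and 5/8/15 bits for white, treats bit 0 of an integer as the first bit read, and ignores the command bits that precede a run length. All names are hypothetical.

```python
RUN_FIELD_BITS = {
    "black": {"short": 5, "medium": 10, "long": 15},
    "white": {"short": 5, "medium": 8, "long": 15},
}


def decode_run_length(stream: int, colour: str) -> tuple[str, int, int]:
    """Decode one run-length code from the low bits of `stream`.

    Returns (kind, run, bits_consumed). Bit 0 of `stream` is the first
    bit read, matching the right-to-left reading order.
    """
    if stream & 1:                      # suffix "1"
        kind, suffix_bits = "short", 1
    elif (stream >> 1) & 1:             # suffix "10"
        kind, suffix_bits = "medium", 2
    else:                               # suffix "00"
        kind, suffix_bits = "long", 2
    width = RUN_FIELD_BITS[colour][kind]
    run = (stream >> suffix_bits) & ((1 << width) - 1)
    return kind, run, suffix_bits + width
```

A medium code whose run is 31 or less would additionally signal entry to pass through mode; the sketch leaves that decision to the caller.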
There is an additional enhancement to the G4 fax algorithm [...]. It is possible for data to compress negatively using the G4 fax algorithm. On occasions like this it would be easier to pass the data to the LBD as un-compressed data. [...] was not implemented in the PEC1 version of the LBD. When the LBD is [...]

[...] However under the coding scheme of Table 104 it is still legal to write it as a medium or long runlength. The LBD [...]

24.2.2 DRAM Access Requirements

i-*ce»*.o,U.n=,^^.sDr^;^S.S^.2^\^'^^^^^ 
Table 105. DRAM bandwidth requirements

Direction   Maximum number of cycles between   Peak Bandwidth        Average Bandwidth
            each 256-bit DRAM access           (bits/cycle)          (bits/cycle)
Read        256 (1:1 compression) (note 1)     1 (1:1 compression)   0.1 (10:1 compression)

1. At 1:1 compression the LBD requires 1 bit/cycle or 256 bits every 256 cycles.

24.3 IMPLEMENTATION

24.3.1 Definitions of IO

Table 106. LBD Port List

Clocks and Resets



pclk 



prst_n 



Bandstore signals

In

cdu_endofbandstore[21:5]

cdu_startofbandstore[21:5]
lbd_finishedband

17

In

Address of the end of the current band of data.
256-bit word aligned DRAM address.

lbd_diu_radr[21:5]

diu_lbd_rvalid

64

In

In

Acknowledge from DIU that read request has been accepted and new read address can be placed on lbd_diu_radr.

Data from DIU to SoPEC Units.
First 64-bits is bits 63:0 of 256 bit word.
Second 64-bits is bits 127:64 of 256 bit word.
Third 64-bits is bits 191:128 of 256 bit word.
Fourth 64-bits is bits 255:192 of 256 bit word.

pcu_addr[5:2]

pcu_dataout[31:0]

lbd_pcu_rdy

32

In

Out

In

Out

PCU address bus. Only 4 bits are required to decode the address space for this block.

Read data bus from the LBD to the PCU.

Common read/not-write signal from the PCU.

[...] When pcu_lbd_sel is high both pcu_addr and pcu_dataout are valid.

[...] When lbd_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this [...] a read cycle this means the data on lbd_pcu_datain is [...]

lbd_sfu_advline

In

[...] indicating SFU has previous line data available for reading and is also ready to be written [...]

24.3.2 Configuration Registers

Table 107. LBD Configuration Registers

Control registers
0x00 



0x04 



0x0 



A write to this register causes a reset of the LBD.

This register can be read to indicate the reset state:

0 - reset in progress

1 - reset not in progress

Writing 1 to this register starts the LBD.
Writing 0 to this register halts the LBD.
The Go register is reset to 0 by the LBD when it finishes processing a band.
When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values.

When Go is asserted all counters are reset, but configuration registers keep their values (i.e. they don't get reset).
The LBD should only be started after the SFU is started.

This register can be read to determine if the LBD is running
(1 - running, 0 - stopped).



0x0C



0x10 



PassThroughEnable



PassThroughDotLength 



16 



0x0000 



Width of expanded bi-level line (in dots)
(must be a multiple of 16 bits).

Writing 1 to this register enables pass-through mode.

Writing 0 to this register disables pass-through mode thereby making the LBD compatible with PEC1.

Number of dots for which pass-through mode will last. If the end of the line is reached first then pass-through will be disabled.



0x14 



0x18



NextBandCurrReadAdr[21:5]
(256-bit aligned DRAM address)

NextBandLinesRemaining



17 



15 



0x0000 
0 



0x0000 



Shadow register which is copied to CurrReadAdr when (NextBandEnable == 1 & Go == 0).

NextBandCurrReadAdr is the address of the start of the next band of compressed bi-level data in DRAM.

Shadow register which is copied to LinesRemaining when (NextBandEnable == 1 & Go == 0).

NextBandLinesRemaining is the number of lines to be decoded in the next band of compressed bi-level data.

Shadow register which is copied to PrevLineSource when (NextBandEnable == 1 & Go == 0).

1 - use the previous line read from the SFU for decoding the first line at the start of the next band.

0 - ignore the previous line read from the SFU for decoding the first line at the start of the next band (an all 0's line is used instead).


0x20

NextBandEnable

1

0x0

If (NextBandEnable == 1 & Go == 0) then
- NextBandCurrReadAdr is copied to CurrReadAdr,
- NextBandLinesRemaining is copied to LinesRemaining,
- NextBandPrevLineSource is copied to PrevLineSource,
- Go is set,
- NextBandEnable is cleared.
To start LBD processing NextBandEnable should be set.

Work registers (read only for external access)




CurrReadAdr[21:5]
(256-bit aligned DRAM address)

17

The current 256-bit aligned read address within the compressed bi-level image (DRAM address). Read only register.


0x28
0x2C

LinesRemaining

15

Count of number of lines remaining to be decoded. The band has finished when this number reaches 0. Read only register.


0x30 


PrevLineSource

1

1 - uses the previous line read from the SFU for decoding the first line at the start of the next band.

0 - ignores the previous line read from the SFU for decoding the first line at the start of the next band (an all 0's line is used instead).

Read only register.


0x34 


CurrWriteAdr

15

The current dot position for writing to the SFU. Read only register.

FirstLineOfBand

1

Indicates whether the current line is considered to be the first line of the band. Read only register.



24.3.3 Starting the LBD between bands 

[...] (this will set LBD Go). The LBD decodes a single band and then stops, clearing its Go bit and issuing a pulse on lbd_finishedband. The LBD can then be restarted for the next band, while the HCU continues to process previously decoded bi-level data from the SFU.

There are 4 mechanisms for restarting the LBD between bands:

a. [...] The LBD will have stopped and cleared its Go bit. [...] NextBandCurrReadAdr, NextBandLinesRemaining and NextBandPrevLineSource [...]
;S3rr— — ^^^^^ 

d. This is a combination of b and c above. The PCU (rather than the CPU [...]) [...] so the LBD restarts immediately. Simultaneously, lbd_finishedband triggers the PCU to [...] and sets NextBandEnable [...]
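
The shadow register rule described for NextBandEnable in Table 107 can be modelled in software as follows. This is a behavioural sketch only: the class, its method names and the idea of evaluating the rule in a `tick` call are assumptions made for illustration, not part of the hardware design.

```python
from dataclasses import dataclass


@dataclass
class LbdRegisters:
    # Shadow (next band) registers, written by the CPU or the PCU.
    next_band_curr_read_adr: int = 0
    next_band_lines_remaining: int = 0
    next_band_prev_line_source: int = 0
    next_band_enable: int = 0
    # Go control bit.
    go: int = 0
    # Work registers (read only for external access).
    curr_read_adr: int = 0
    lines_remaining: int = 0
    prev_line_source: int = 0

    def tick(self) -> None:
        """Apply the restart rule once: load shadows when enabled and idle."""
        if self.next_band_enable == 1 and self.go == 0:
            self.curr_read_adr = self.next_band_curr_read_adr
            self.lines_remaining = self.next_band_lines_remaining
            self.prev_line_source = self.next_band_prev_line_source
            self.go = 1
            self.next_band_enable = 0

    def finish_band(self) -> None:
        """Model the end of a band: the LBD clears Go."""
        self.go = 0
```

Programming the shadow registers and setting NextBandEnable while a band is still running has no effect until Go is cleared, at which point the next `tick` restarts the model immediately with the new values.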
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24.3.4 Top-level Description 

A block diagram of the LBD is shown in Figure 113.



Figure 113. Block diagram of lossless bi-level decoder

The LBD contains the following sub-blocks: 
Table 108. Functional sub-blocks in the LBD

Name                   Description
Registers and Resets   PCU interface and configuration registers. Also generates the Go and the Reset signals for the rest of the LBD.
Stream Decoder         Accesses the bi-level description from the DRAM through the DIU interface. It decodes the bit stream into a command with arguments which it then passes to the command controller.
Command Controller     Interprets the command from the stream decoder and provides the line fill unit with a limit address for filling the SFU Next Line Buffer. It also provides the next edge unit with a starting address to look for the next edge.
Next Edge Unit         Scans through the Previous Line Buffer using its current address to find the next edge of a color provided by the command controller. The next edge unit outputs this as the next current address back to the command controller and sets a valid bit when this address is at the next edge.
Line Fill Unit         Fills the SFU Next Line Buffer with a color from its current address up to a limit address. The color and limit are provided by the command controller.





Naming of signals and logical blocks are taken from [18].

The LBD is able to stall mid-line should the SFU be unable to supply a previous line or receive a current line frame due to band processing latency.

All output control signals from the LBD must always be valid [...]

24.3.5 Registers and Resets sub-block description

[...] The register descriptions for the LBD are listed in Table 107.

[...] sub-blocks in the LBD. In the case of LinesRemaining, this number is decremented for every line that is completed by the LBD.

[...] length of the compressed bits in pass through mode.

[...] When a 0 is written the LBD ignores the previous line [...] and acts as if it is receiving all zeros for the previous line regardless of what the output of the SFU is.

[...] requesting data from the DIU and commence decoding of the compressed data stream.

24.3.6 Stream Decoder Sub-block Description

[...] FIFO to fill up the empty space created by the [...]

A dataflow block diagram of the stream decoder is shown in Figure 114.



Figure 114. Stream decoder block diagram

24.3.6.1 DecodeC - Decode Command

The DecodeC logic encodes the command from bits 6..0 of the bit stream to output one of three commands: SKIP, VERTICAL and RUNLENGTH. It also provides an output to indicate how many bits were consumed, which feeds back to the [...] register.

[...] PASS_THROUGH [...] is inferred in a special runlength. If the stream decoder detects a short runlength value, i.e. a number less than 31, encoded as a medium runlength, this tells the state machine that once the [...] runlength is decoded completely the LBD enters PASS_THROUGH mode. Following the runlength there will be a number of bits that represent un-compressed data. The LBD will stay in PASS_THROUGH mode until all these bits have been decoded successfully. This will occur once a programmed number of bits is reached or the line ends, which ever comes first.

24.3.6.2 DecodeD - Decode Delta 

[...] 15 bit number, which is generally considered to be positive, but since it needs to only address to 13824 [...], a 2's complement representation of -3, -2, -1 will work correctly for the data pipeline that follows. This unit also outputs how many bits were consumed.

[...] the bits that represent the un-compressed data and [...] and the current command.
24.3.6.3 State-machine 



24.3.7 Command Controller Sub-block Description 

T^Z'^!^z,°nt^:^rTT' "z^™ o"< such « 
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Figure 115. Command controller block diagram 



24.3.7.1 State machine 



The following is an explanation of all the states that the state machine utilizes.

i START

[...] (Next Edge Unit, [...]

ii AWAIT_BUFFER

[...] NEU_RUNNING state. Once this occurs the command controller can proceed to the PARSE state.

iii PAUSE_CC

iv PARSE

When in this state the command controller can receive one of four valid commands:

a) Runlength or Horizontal

[...] (only 16 bits at a time), it is necessary for the command controller to wait [...]

b) Vertical

[...]

c) Skip

[...] commands but the color in the current line is not [...] that the command controller treats the same as two Vertical(0) commands and has been coded not to change the current color in this case.

d) Pass Through

[...] that it uses to construct [...] LBD can recommence normal [...] color as the last bit in the un-compressed data stream. [...] command controller as each pass through command received from the stream decoder can always be processed in one clock cycle.

v WAIT_FOR_RUNLENGTH

[...] RUNLENGTH. After the first clock cycle the command controller enters into the WAIT_FOR_RUNLENGTH state until all the RUNLENGTH data has been consumed. [...] the command controller will return to the PARSE state.

vi WAIT_FOR_NE

[...] WAIT_FOR_NE state and remains here until the edge is detected. [...] command controller will return to the PARSE state.

vii FINISH_LINE

Figure 116. State diagram for the Command Controller (CC) state machine

24.3.8 Next Edge Unit Sub-block Description






[...] continue doing this until it finds an edge or reaches the end [...]

Figure 117. Next Edge Unit block diagram

24.3.8.1 NEU Buffer



[...] can only receive 16 bits at a time from the SFU. This presents a problem [...] frame, but refers to a changing element in the current [...] needed from the previous line to construct the current frame of the current line.

[...] frame it needs it on the next clock cycle to maintain a decoded rate of 2 bits per clock cycle. A more detailed diagram of the buffer in the NEU is shown in Figure 118.



Figure 118. Next edge unit buffer diagram

The buffer supplies two 16-bit vectors, use_prev_line_a and use_prev_line_b, that are used to detect an edge that is relevant to the current line being put together in the Line Fill Unit.

24.3.8.2 NEU Edge Detect

The NEU Edge Detect block takes the two 16 bit vectors supplied by the buffer and, based on the current line position in the current line, a0, and the current color, [...]






Figure 119. Next edge unit edge detect diagram

Table 109. Decode_b truth table

Input (4 bits)   Output (16 bits)
0000             1111111111111111
0001             1111111111111110
0010             1111111111111100
0011             1111111111111000
0100             1111111111110000
0101             1111111111100000
0110             1111111111000000
0111             1111111110000000
1000             1111111100000000
1001             1111111000000000
1010             1111110000000000
1011             1111100000000000
1100             1111000000000000
1101             1110000000000000
1110             1100000000000000
1111             1000000000000000
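
Every row of Table 109 is the all-ones vector shifted left by the 4-bit input, so the table can be expressed in one line. The following sketch reproduces the table; the function and parameter names are illustrative only.

```python
def decode_b(position: int) -> int:
    """Return the 16-bit mask with bits `position`..15 set, as in Table 109."""
    if not 0 <= position <= 15:
        raise ValueError("position must be a 4-bit value")
    return (0xFFFF << position) & 0xFFFF
```

Formatting the result with `format(decode_b(n), "016b")` gives the output column of the table for input n.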



Table 110. Decode_b_ext truth table

Input          Output
Vertical(-3)   111
Vertical(-2)   111
Vertical(-1)   011
OTHERS         001

[...] 2.2.5 a) in [18] refers to "Processing [...]

24.3.8.3 Encode_b_one_hot

Table 111 lists the truth table outlining the functionality required by this block.

Table 111. Encode_b_one_hot Truth Table

Input                 Output
XXXXXXXXXXXXXXXXXX1   0000000000000000001
XXXXXXXXXXXXXXXXX10   0000000000000000010
XXXXXXXXXXXXXXXX100   0000000000000000100
XXXXXXXXXXXXXXX1000   0000000000000001000
XXXXXXXXXXXXXX10000   0000000000000010000
XXXXXXXXXXXXX100000   0000000000000100000
XXXXXXXXXXXX1000000   0000000000001000000
XXXXXXXXXXX10000000   0000000000010000000
XXXXXXXXXX100000000   0000000000100000000
XXXXXXXXX1000000000   0000000001000000000
XXXXXXXX10000000000   0000000010000000000
XXXXXXX100000000000   0000000100000000000
XXXXXX1000000000000   0000001000000000000
XXXXX10000000000000   0000010000000000000
XXXX100000000000000   0000100000000000000
XXX1000000000000000   0001000000000000000
XX10000000000000000   0010000000000000000
X100000000000000000   0100000000000000000
1000000000000000000   1000000000000000000
0000000000000000000   0000000000000000000
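
Table 111 keeps only the least significant asserted bit of the 19-bit input. In software this can be sketched with the usual two's complement identity; the constant and function name below are illustrative and not taken from the design.

```python
WIDTH = 19  # vector width used in Table 111


def encode_b_one_hot(vector: int) -> int:
    """Keep only the least significant set bit of a 19-bit vector."""
    vector &= (1 << WIDTH) - 1
    # x & -x isolates the lowest set bit; an all-zero input stays zero.
    return vector & -vector
```

An all-zero input yields an all-zero output, as in the last row of the table.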



24.3.8.4 Encode_b_4bit

[...] to determine the address [...] asserted, the bit location in the vector is converted to [...] which vertical command is being processed. The following [...]:

for V(n): b1p = (x + n) modulus 16

where x is the number that was extracted from the "one-hot" vector and n is the vertical command.



24.3.8.5 State machine 




Figure 120. State diagram for the Next Edge Unit (NEU) state machine 

The following is an explanation of all the states that the NEU state machine utilizes.

i NEU_START

[...] AWAIT_BUFF state. When this occurs the NEU enters the NEU_FILL_BUFF [...]

ii NEU_FILL_BUFF

Before any compressed data can be decoded the NEU needs to fill up its [...] SFU. The rest of the LBD waits while the NEU [...] completed it enters the [...] state.

iii NEU_HOLD

The NEU waits in this state for one clock cycle while data requested from the SFU on the last access [...]

iv NEU_RUNNING

v NEU_EMPTY

[...] This occurs when the end_of_line signal is detected from the LBD.

24.3.9 Line Fill Unit sub-block description



The Line Fill Unit, LFU, is responsible for filling the next line buffer in the SFU. [...] The LBD signals to the SFU [...]

A dataflow block diagram of the line fill unit is shown in Figure 121.



Figure 121. Line fill unit block diagram

The dataflow above has the following blocks:

24.3.9.1 State Machine

The following is an explanation of all the states that the LFU state machine utilizes.

i LFU_START

[...] a0 is no longer zero; this only occurs once the command controller starts processing data from the Next Edge Unit, NEU.

ii LFU_NEW_REG

LFU_NEW_REG handles all the lbd_sfu_wdata writes and asserts lbd_sfu_wdatavalid as necessary.

iii LFU_COMPLETE_REG

The command controller supplies the a0 value and the color, and the state machine uses these to derive the limit and color_sel_16bit_lf which the line_fill_data block needs to construct a frame. The limit is the four lower significant bits of a0, and color_sel_16bit_lf is a 16-bit wide mask of sd_color. The state machine also maintains a check on the upper eleven bits of a0. If these increment from one clock cycle to the next, a frame is completed and the data can be written to the SFU. [...]




Figure 122. State diagram for the Line Fill Unit (LFU) state machine

24.3.9.2 Line_fill_data

[...] color_sel_16bit_lf values and constructs the current frame that [...] The following pseudo code illustrates the [...] work_sfu_wdata is exported by the LBD to the SFU as [...]

if (lfu_state == LFU_START) OR (lfu_state == LFU_NEW_REG) then
   work_sfu_wdata = color_sel_16bit_lf
else
   work_sfu_wdata[(15 - limit) downto limit] =
      color_sel_16bit_lf[(15 - limit) downto limit]
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25 Spot FIFO Unit (SFU) 

25.1 Overview 

[...] between the LBD and the HCU. By abstracting the buffering mechanism and controls from both units, the interface is clean between [...]

25.2 Main features of the SFU 

The SFU replaces the Spot Line Buffer Interface (SLBI) in PEC1. The spot line store is now located in DRAM. The SFU outputs the previous line to the LBD, stores the next line produced by the LBD and outputs the HCU read line.

[...] DRAM word, will already be [...]

[...] previous line data is not supplied until after the first lbd_sfu_advline strobe [...] is available for writing. lbd_sfu_advline tells the SFU to advance to the next line. [...] of previous line data. Until the number of lbd_sfu_pladvword strobes received is equivalent to the LBD line length, [...] the SFU is available for both reading and writing. Thereafter it indicates the SFU is available for writing. The LBD should not generate lbd_sfu_pladvword or lbd_sfu_advline strobes until sfu_lbd_rdy is asserted.

A signal sfu_hcu_avail indicates that the SFU has data to supply to the HCU. [...] is true. The HCU can therefore stall [...]

X and Y non-integer scaling of the bi-level dot data is performed in the SFU.

[...] 3 [...] in total (read + read + write). Therefore the SFU requires two 256 bit read DRAM accesses per 256 cycles and 1 write access every 256 cycles. A single DIU read interface will be shared for reading the current and previous lines [...]

25.3 BI-LEVEL DRAM MEMORY BUFFER BETWEEN LBD, SFU AND HCU

Figure 123. Bi-level DRAM buffer
(Address pointers shown between the low and high buffer addresses: lbd_nextline_adr, lbd_prevline_adr, hcu_readline_adr, hcu_startreadline_adr. Legend: free buffer space; filled buffer space accessed by LBD Interface FIFOs; filled buffer space read by HCU Read Line FIFO; filled buffer space read by both HCU Read Line FIFO and LBD Interface FIFOs.)

[...] previous line address reading [...] before the HCU read line address in DRAM.

The SFU interfaces to DRAM via three FIFOs:

a. The HCUReadLineFIFO which supplies dot data to the HCU.
b. The LBDNextLineFIFO which writes decompressed bi-level data from the LBD.
c. The LBDPrevLineFIFO which reads previous decompressed bi-level data for the LBD.

There are four address pointers used to manage the bi-level DRAM buffer:

a. hcu_readline_adr[21:5] is the read address in DRAM for the HCUReadLineFIFO.
b. hcu_startreadline_adr[21:5] is the start address in DRAM for the current line being read by the HCUReadLineFIFO.
c. lbd_nextline_adr[21:5] is the write address in DRAM for the LBDNextLineFIFO.
d. lbd_prevline_adr[21:5] is the read address in DRAM for the LBDPrevLineFIFO.

The address pointers must obey certain rules which indicate whether they are valid:

[...] in the line than [...]

[...] lbd_nextline_adr[21:5] even though [...] is initially [...]

f. The address pointers can wrap around the SFU bi-level store area in DRAM.

[...] This can be useful for absorbing local [...]

25.4 DRAM ACCESS REQUIREMENTS

[...] of its previous, current and next line interfaces.

The SFU's DIU bandwidth requirements are summarized in Table 112.

Table 112. DRAM bandwidth requirements

1: Two separate reads of 1 bit/cycle.
2: Write at 1 bit/cycle.



25.5 SCALING



factor represented by a numerator and a denominator ^i non-mteger scaling with the scale 
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or 



if (count + denominator >= numerator) then
   count = (count + denominator) - numerator
   advance = 1
else
   count = count + denominator
   advance = 0

X scaling controls whether the SFU supplies the next dot or a copy of the current dot when the HCU asserts hcu_sfu_advdot. The SFU counts the number of hcu_sfu_advdot signals from the HCU. When the SFU has supplied an entire HCU line of data, the SFU will either re-read the current line from DRAM or advance to the next line of HCU read data depending on the programmed Y scale factor.

An example of scaling for numerator = 7 and denominator = 3 is given in Table 113. The signal advance, if asserted, causes the next input dot to be output on the next cycle; otherwise the same input dot is output.







Table 113. Non-integer scaling example (numerator = 7, denominator = 3)

count   advance   dot
0       0         1
3       0         1
6       1         1
2       0         2
5       1         2
1       0         3
4       1         3
0       0         4
3       0         4
6       1         4
2       0         5
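
The scaling rule of section 25.5 can be exercised in software to reproduce the sequence of count, advance and dot values tabulated above. The sketch below is illustrative only; the function name, the `steps` parameter and the `start_count` parameter are assumptions made for the example.

```python
def scale_sequence(numerator: int, denominator: int, steps: int, start_count: int = 0):
    """Return a list of (count, advance, dot) rows for the scaling rule."""
    count, dot = start_count, 1
    rows = []
    for _ in range(steps):
        if count + denominator >= numerator:
            next_count, advance = count + denominator - numerator, 1
        else:
            next_count, advance = count + denominator, 0
        rows.append((count, advance, dot))
        count = next_count
        dot += advance  # the next input dot is used on the following step
    return rows
```

With numerator 7 and denominator 3, each input dot is output 3, 2, 2, 3, ... times, an average of 7/3 outputs per input dot.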



25.6 LEAD-iN AND LEAD-OUT CUPPING 

SoPE? -Sf w 11 H ' r ''V'=P''««««* scale-fector number of times by an individual 

SS:oS ^^^^-"^ ^^'^y both devices doing part of the scaling, one^ 

i?„Sf 1^ o&er on Its lead m. Scaled up dots on the lead-out. i.e. which go beyond the HCU Un^ 

S:S;rc<^«^S '^-'^ " « controlled by 

The count in the algorithm above is set to XstartCount. If there is no lead-in, XstartCount is set to 0, i.e. the first value of count in Table 113. If there is lead-in then XstartCount needs to be set to the appropriate value of count in the sequence above.
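One way to find "the appropriate value of count in the sequence" is to run the scaling recurrence for the number of scaled output dots that are clipped on the lead-in. The sketch below assumes that interpretation; the function name `x_start_count` and the parameter `clipped_output_dots` are hypothetical and are not taken from the specification.

```python
def x_start_count(numerator, denominator, clipped_output_dots):
    """Return the scaling count reached after skipping some output dots.

    Assumption: the lead-in is expressed as a number of already-scaled
    output dots that this device does not print.
    """
    count = 0
    for _ in range(clipped_output_dots):
        if count + denominator >= numerator:
            count = count + denominator - numerator
        else:
            count = count + denominator
    return count


# With a 7/3 scale factor, clipping two output dots leaves count at 6.
print(x_start_count(7, 3, 2))
```

With no lead-in the function returns 0, matching the typical XstartCount setting.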
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25.7 Interfaces between LDB, SFU and HCU 



Figure 124. Interfaces between LBD/SFU/HCU



25.7.1 LBD-SFU Interfaces

The LBD has two interfaces to the SFU. The LBD writes the next line to the SFU and reads the previous line from the SFU.



25.7.1.1 LBDNextLineFIFO interface

The LBDNextLineFIFO interface from the LBD to the SFU comprises the following signals:

• lbd_sfu_wdata, 16-bit write data.

• lbd_sfu_wdatavalid, write data valid.

• lbd_sfu_advline, signal indicating LBD has advanced to the next line.

The LBD should not write next line data until sfu_lbd_rdy is true. The LBD can therefore stall waiting for the sfu_lbd_rdy signal.

25.7.1.2 LBDPrevLineFIFO interface

The LBDPrevLineFIFO interface from the SFU to the LBD comprises the following signals:

• sfu_lbd_pldata, 16-bit data.

The interface from the LBD to the SFU comprises the following signals:

• lbd_sfu_pladvword, signal indicating to the SFU to supply the next 16-bit word.

• lbd_sfu_advline, signal indicating LBD has advanced to the next line.

Previous line data is not supplied until after the first lbd_sfu_advline strobe from the LBD (zero data is supplied instead). The LBD should not assert lbd_sfu_pladvword unless sfu_lbd_rdy is asserted.

25.7.1.3 Common Control Signals

sfu_lbd_rdy indicates to the LBD that the SFU is available for writing. After the first lbd_sfu_advline and before the number of lbd_sfu_pladvword strobes received is equivalent to the LBD line length, sfu_lbd_rdy indicates that the SFU is available for both reading and writing. Thereafter it indicates the SFU is available for writing.

The LBD should not generate lbd_sfu_pladvword or lbd_sfu_advline strobes until sfu_lbd_rdy is asserted.

25.7.2 SFU-HCU Current Line FIFO Interface

The interface from the SFU to the HCU comprises the following signals:

• sfu_hcu_sdata, 1-bit data.

• sfu_hcu_avail, data valid signal indicating that there is data available in the SFU HCUReadLineFIFO.

The interface from HCU to SFU comprises the following signals:

• hcu_sfu_advdot, indicating to the SFU to supply the next dot.

The HCU should not generate the hcu_sfu_advdot signal until sfu_hcu_avail is true. The HCU can therefore stall waiting for the sfu_hcu_avail signal.
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25.8 Implementation 



25.8.1 Definitions of IO

Table 114. SFU Port List

Port name | Pins | I/O | Description

Clocks and Resets
pclk | 1 | In | SoPEC Functional clock.
prst_n | 1 | In | Global reset signal.

DIU Read Interface signals
sfu_diu_rreq | 1 | Out | SFU requests DRAM read. A read request must be accompanied by a valid read address.
sfu_diu_radr[21:5] | 17 | Out | Read address to DIU. 17 bits wide (256-bit aligned word).
diu_sfu_rack | 1 | In | Acknowledge from DIU that read request has been accepted and new read address can be placed on sfu_diu_radr.
diu_data[63:0] | 64 | In | Data from DIU to SoPEC Units. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
diu_sfu_rvalid | 1 | In | Signal from DIU telling SoPEC Unit that valid read data is on the diu_data bus.

DIU Write Interface signals
sfu_diu_wreq | 1 | Out | SFU requests DRAM write. A write request must be accompanied by a valid write address together with valid write data and a write valid.
sfu_diu_wadr[21:5] | 17 | Out | Write address to DIU. 17 bits wide (256-bit aligned word).
diu_sfu_wack | 1 | In | Acknowledge from DIU that write request has been accepted and new write address can be placed on sfu_diu_wadr.
sfu_diu_data[63:0] | 64 | Out | Data from SFU to DIU. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
sfu_diu_wvalid | 1 | Out | Signal from PEP Unit indicating that data on sfu_diu_data is valid.

PCU interface data and control signals
pcu_adr[5:2] | 4 | In | PCU address bus. Only 4 bits are required to decode the address space for this block.
pcu_dataout[31:0] | 32 | In | Shared write data bus from the PCU.
sfu_pcu_datain[31:0] | 32 | Out | Read data bus from the SFU to the PCU.
pcu_rwn | 1 | In | Common read/not-write signal from the PCU.
pcu_sfu_sel | 1 | In | Block select from the PCU. When pcu_sfu_sel is high both pcu_adr and pcu_dataout are valid.
sfu_pcu_rdy | 1 | Out | Ready signal to the PCU. When sfu_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on sfu_pcu_datain is valid.

LBD Interface Data and Control Signals
sfu_lbd_rdy | 1 | Out | Signal indicating that SFU has previous line data available and is ready to be written to.
lbd_sfu_advline | 1 | In | Line advance signal for both next and previous lines.
lbd_sfu_pladvword | 1 | In | Advance word signal for previous line buffer.
sfu_lbd_pldata[15:0] | 16 | Out | Data from the previous line buffer.
lbd_sfu_wdata[15:0] | 16 | In | Write data for next line buffer.
lbd_sfu_wdatavalid | 1 | In | Write data valid signal for next line buffer data.

HCU Interface Data and Control Signals
hcu_sfu_advdot | 1 | In | Signal indicating to the SFU that the HCU is ready to accept the next dot of data from SFU.
sfu_hcu_sdata | 1 | Out | Bi-level dot data.
sfu_hcu_avail | 1 | Out | Signal indicating valid bi-level dot data on sfu_hcu_sdata.
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25.8.2 Configuration Registers 

Table 115. SFU Configuration Registers

Address | Register name | #bits | Value on reset | Description

Control registers
0x00 | Reset | 1 | 0x1 | A write to this register causes a reset of the SFU. This register can be read to indicate the reset state: 0 - reset in progress, 1 - reset not in progress.
0x04 | Go | 1 | 0x0 | Writing 1 to this register starts the SFU. Writing 0 to this register halts the SFU. When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values. When Go is asserted all counters are reset but configuration registers keep their values (i.e. they don't get reset). The SFU must be started before the LBD is started. This register can be read to determine if the SFU is running (1 - running, 0 - stopped).

Setup registers
0x08 | HCUNumDots | 16 | 0x0000 | Width of HCU line (in dots).
0x0C | HCUDRAMWords | 8 | 0x00 | Number of 256-bit DRAM words in a HCU line.
0x10 | LBDNumWords | 12 | 0x000 | Number of 16-bit words in an LBD line. (LBD line length must be a multiple of 16 bits).
0x14 | StartSfuAdr[21:5] (256-bit aligned DRAM address) | 17 | 0x00000 | First SFU location in memory.
0x18 | EndSfuAdr[21:5] (256-bit aligned DRAM address) | 17 | 0x00000 | Last SFU location in memory.
0x1C | XstartCount | 8 | 0x00 | Value to be loaded at the start of every line into the counter used for scaling in the X direction. Used to control the scaling of the first dot in a line. This value will typically equal zero, except in the case where a number of dots are clipped on the lead in to a line.
0x20 | XscaleNum | 8 | 0x01 | Numerator of spot data scale factor in X direction.
0x24 | XscaleDenom | 8 | 0x01 | Denominator of spot data scale factor in X direction.
0x28 | YscaleNum | 8 | 0x01 | Numerator of spot data scale factor in Y direction.
0x2C | YscaleDenom | 8 | 0x01 | Denominator of spot data scale factor in Y direction.

Work registers (PCU has read-only access)
0x30 | HCUReadLineAdr[21:5] (256-bit aligned DRAM address) | 17 | | Current address pointer in DRAM to HCU read data. Read only register.
0x34 | HCUStartReadLineAdr[21:5] (256-bit aligned DRAM address) | 17 | | Start address in DRAM of line being read by HCU buffer in DRAM. Read only register.
0x38 | LBDNextLineAdr[21:5] (256-bit aligned DRAM address) | 17 | | Current address pointer in DRAM to LBD write data. Read only register.
0x3C | LBDPrevLineAdr[21:5] (256-bit aligned DRAM address) | 17 | | Current address pointer in DRAM to LBD read data. Read only register.
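The address registers hold bits [21:5] of a 256-bit aligned DRAM address, so software has to drop the five low-order bits before programming them. The sketch below illustrates that packing under the assumption that DRAM addresses are byte addresses (256 bits = 32 bytes); the helper name `adr_field` is hypothetical.

```python
ADR_FIELD_BITS = 17          # width of the [21:5] field
WORD_BYTES = 32              # one 256-bit DRAM word, assuming byte addressing


def adr_field(byte_address):
    """Convert a 256-bit aligned byte address to its [21:5] register field."""
    if byte_address % WORD_BYTES != 0:
        raise ValueError("address is not 256-bit aligned")
    field = byte_address >> 5
    if field >= (1 << ADR_FIELD_BITS):
        raise ValueError("address does not fit in bits [21:5]")
    return field


# A buffer starting at byte 0x2000 and ending at the word at byte 0x3FE0.
start_sfu_adr = adr_field(0x2000)
end_sfu_adr = adr_field(0x3FE0)
print(hex(start_sfu_adr), hex(end_sfu_adr))
```

The example addresses are arbitrary; any pair of aligned addresses with the start not above the end could be used for StartSfuAdr and EndSfuAdr.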
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25.8.3 SFU sub-block partition 



Figure 125. SFU Sub-Block Partition
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The SFU contains a number of sub-blocks: 







Sub-block | Description
PCU Interface | PCU interface, configuration and status registers. Also generates the Go and the Reset signals for the rest of the SFU.
LBD Previous Line FIFO | Contains FIFO which is read by the LBD previous line interface.
LBD Next Line FIFO | Contains FIFO which is written by the LBD next line interface.
HCU Read Line FIFO | Contains FIFO which is read by the HCU interface.
DIU Interface and Address Generator | Contains DIU read interface and DIU write interface. Manages the address pointers for the bi-level DRAM buffer. Contains X and Y scaling logic.



The various FIFO sub-blocks have no knowledge of where in DRAM their read or write data is stored. In this sense the FIFO sub-blocks are completely de-coupled from the bi-level DRAM buffer. All DRAM address management is centralised in the DIU Interface and Address Generation sub-block. DRAM access is pre-emptive, i.e. after a FIFO unit has made an access then as soon as the FIFO has space to read or data to write a DIU access will be requested immediately. This ensures there are no unnecessary stalls introduced, e.g. at the end of an LBD or HCU line.

There now follows a description of the SFU sub-blocks.

25.8.4 PCU Interface Sub-block

The PCU interface sub-block provides for the CPU to access SFU specific registers by reading or writing to the SFU address space.
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25.8.5 LBDPrevUneFIFO sub-block 



Table 116. LBDPrevLineFIFO Additional IO Definitions

Port name | Pins | I/O | Description
plf_rdy | 1 | Out | Signal indicating LBDPrevLineFIFO is ready to be read from. Until the first lbd_sfu_advline for a band has been received and after the number of lbd_sfu_pladvword strobes received for a line is equal to LBDNumWords, plf_rdy is always asserted. During the second and subsequent lines plf_rdy is deasserted whenever the LBDPrevLineFIFO is empty.

DIU and Address Generation sub-block Signals
plf_diurreq | 1 | Out | Signal indicating the LBDPrevLineFIFO has 256-bits of data free.
plf_diurack | 1 | In | Acknowledge that read request has been accepted and plf_diurreq should be de-asserted.
plf_diurdata[63:0] | 64 | In | Data from the DIU to LBDPrevLineFIFO. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
plf_diurvalid | 1 | In | Signal indicating data on plf_diurdata is valid.
plf_diuidle | 1 | Out | Signal indicating DIU state-machine is in the IDLE state.



25.8.5.1 General Description

The LBDPrevLineFIFO sub-block comprises a double 256-bit buffer between the LBD and the DIU Interface and Address Generator sub-block. The FIFO is implemented as 8 times 64-bit words. The FIFO is written by the DIU Interface and Address Generator sub-block and read by the LBD.
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Figure 126. LBDPrevUneFifo Sub-block 



Whenever 4 locations in the FIFO are free the FIFO will request 256-bits of data from the DIU Interface and Address Generation sub-block by asserting plf_diurreq.



Figure 127. Timing of signals on the LBDPrevLineFIFO interface to DIU and Address Generator
(Signals shown: pclk, plf_diurreq, plf_diurack, plf_diurvalid, plf_diurdata[63:0].)
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diuidfe = 1 



Figure 128. Timing of signals on LBDPrevLineFIFO interface to DIU and Address Generator
(States shown: Idle (diuidle = 1), Request (diurreq = 1, diuidle = 0), Ack (diurreq = 0), Data0, Data1, Data2, Data3. The Request to Ack transition is taken on diurack == 1; each subsequent transition through the data states is taken on diurvalid == 1.)

The LBD reads 16-bit wide data from the LBDPrevLineFIFO on sfu_lbd_pldata[15:0]. lbd_sfu_pladvword tells the LBDPrevLineFIFO to supply the next 16-bit word. The FIFO control logic generates a signal word_select which selects the next 16-bits of the 64-bit FIFO word to output on sfu_lbd_pldata. When the entire current 64-bit FIFO word has been read by the LBD, lbd_sfu_pladvword will cause the next word to be popped from the FIFO.

Previous line data is not supplied until after the first lbd_sfu_advline strobe from the LBD after sfu_go is asserted (zero data is supplied instead). Until the first lbd_sfu_advline strobe after sfu_go, lbd_sfu_pladvword strobes are ignored.

The LBDPrevLineFIFO control logic uses a counter, pladvword_count[11:0], to count the number of lbd_sfu_pladvword strobes received from the LBD. The pladvword_count counter is reset to 0 when the number of lbd_sfu_pladvword strobes received is equal to LBDNumWords.

The LBDPrevLineFIFO generates a signal plf_rdy to indicate that it has data available. Until the first lbd_sfu_advline for a band has been received and after the number of lbd_sfu_pladvword strobes received for a line is equal to LBDNumWords, plf_rdy is always asserted. During the second and subsequent lines plf_rdy is deasserted whenever the LBDPrevLineFIFO is empty.

The last 256-bit word for a line read from DRAM can contain extra padding which should not be output to the LBD. This is because lbd_num_words may not fit exactly into a 256-bit DRAM word. When the count equals lbd_num_words the LBDPrevLineFIFO must adjust the FIFO read address to point to the next 256-bit word boundary in the FIFO. This can be achieved by considering that the FIFO read address, read_adr[2:0], will require 3 bits to address 8 locations of 64-bits. The next 256-bit aligned address is calculated by inverting the MSB of read_adr and setting all other bits to 0.

    if (pladvword_count == lbd_num_words) then
        read_adr[1:0] = b00
        read_adr[2] = ~read_adr[2]
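The read address adjustment can be expressed as a single bit operation on the 3-bit address. The following is a minimal sketch of that operation; the function name `next_256bit_boundary` is hypothetical.

```python
def next_256bit_boundary(read_adr):
    """Move a 3-bit FIFO read address to the other 256-bit half.

    Bit 2 selects which 256-bit half of the 8 x 64-bit FIFO is being
    read; it is inverted, and bits 1:0 are cleared.
    """
    if not 0 <= read_adr <= 7:
        raise ValueError("read_adr must be a 3-bit value")
    return (read_adr ^ 0b100) & 0b100


# A line ending part-way through the first half continues at location 4.
print(next_256bit_boundary(2))
# A line ending part-way through the second half continues at location 0.
print(next_256bit_boundary(5))
```

The same adjustment applies to any FIFO organised as two 256-bit halves of four 64-bit words each.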

25.8.6 LBDNextLineFIFO sub-block

Table 117. LBDNextLineFIFO Additional IO Definition

Port name | Pins | I/O | Description
nlf_rdy | 1 | Out | Signal indicating LBDNextLineFIFO is ready to be written to, i.e. there is space in the FIFO.

DIU and Address Generation sub-block Signals
nlf_diuwreq | 1 | Out | Signal indicating the LBDNextLineFIFO has 256-bits of data for writing to the DIU.
nlf_diuwack | 1 | In | Acknowledge from DIU that write request has been accepted and write data can be output on nlf_diuwdata together with nlf_diuwvalid.
nlf_diuwdata[63:0] | 64 | Out | Data from LBDNextLineFIFO to DIU Interface. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
nlf_diuwvalid | 1 | Out | Signal indicating that data on nlf_diuwdata is valid.

25.8.6.1 General Description

The LBDNextLineFIFO sub-block comprises a double 256-bit buffer between the LBD and the DIU Interface and Address Generator sub-block. The FIFO is implemented as 8 times 64-bit words. The FIFO is written by the LBD and read by the DIU Interface and Address Generator.

Figure 129. LBDNextLineFIFO Sub-block

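Whenever the FIFO contains 256-bits of data to be written to the DIU, the FIFO asserts nlf_diuwreq. A signal nlf_diuwack indicates that the request has been accepted and that write data can be output on nlf_diuwdata together with nlf_diuwvalid.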



Figure 130. Timing of signals on LBDNextLineFIFO interface to DIU and Address Generator
(Signals shown: pclk, nlf_diuwreq, nlf_diuwack, nlf_diuwdata[63:0], nlf_diuwvalid.)
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5 



Figure 131. LBDNextLineFIFO DIU Interface State Diagram
(States shown: Idle, Request (diuwreq = 1), Ack (diuwreq = 0), Data0, Data1, Data2, Data3. Idle moves to Request when there are 256 bits in the FIFO; Request moves to Ack on diuwack == 1; diuwvalid = 1 in each of the four data states.)

The LBDNextLineFIFO generates a signal nlf_rdy to indicate that the FIFO has space for writing by the LBD. The LBD writes 16-bit wide data supplied on lbd_sfu_wdata[15:0]. lbd_sfu_wdatavalid indicates that the data is valid. The data is collected to make up a 64-bit word before being written to the FIFO.

The LBDNextLineFIFO control logic counts the number of lbd_sfu_wdatavalid signals.



25.8.7 sfu_lbd_rdy Generation 



sfu_lbd_rdy is generated from plf_rdy from the LBDPrevLineFIFO and nlf_rdy from the LBDNextLineFIFO. After the first lbd_sfu_advline and before the number of lbd_sfu_pladvword strobes received is equivalent to the LBD line length, sfu_lbd_rdy indicates that the SFU is available for both reading, i.e. there is data in the LBDPrevLineFIFO, and writing. Thereafter it indicates the SFU is available for writing.
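The combination of the two ready signals can be sketched as a small boolean function. This is an illustrative model only: the text does not give the exact gate-level expression, so the AND combination during the read phase is an assumption, and the function and parameter names are hypothetical.

```python
def sfu_lbd_rdy(plf_rdy, nlf_rdy, first_advline_seen,
                pladvword_count, lbd_num_words):
    """Model sfu_lbd_rdy from the two FIFO ready signals.

    Assumption: while previous line words are still being read
    (after the first lbd_sfu_advline and before lbd_num_words
    lbd_sfu_pladvword strobes), both FIFOs must be ready.
    Otherwise only the next line FIFO needs space.
    """
    reading = first_advline_seen and pladvword_count < lbd_num_words
    if reading:
        return plf_rdy and nlf_rdy
    return nlf_rdy


# First line of a band: no previous line data is needed yet.
print(sfu_lbd_rdy(False, True, False, 0, 100))
# Second line, previous line FIFO empty: the LBD must stall.
print(sfu_lbd_rdy(False, True, True, 10, 100))
```

Because plf_rdy is described as always asserted outside the read phase, a plain AND of the two signals would give the same result in this model.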
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25.8.8 LBD-SFU Interfaces Timing Waveform Description 

ilTT*"/ 'P- ^-'^V^ "^^'^^^ ^ SFU that registers the data written from the LBD «nd 

The main points to note from Figure 132 are:

last available location available in the FIFO in the SFU. In cfo* <Sir4 tS^ 
LBD has entered a pause mode and waits for sfu_lbd_rdy to be asserted again.

o^^ti^ii by JSSi^ T/L^':^^'' again. The LBD detects this and on clock cycle 8 it starts 
in dock5,de '^^fi^-y'data^ul and putt«« new data out which is registered by the 

not'bc^.""" -l^ch should be highlighted. On examination this tun. 

Scenario 1: 

tj'^-^J^^^J" 1°^ ^ ^ «i" 1 piece of data in the FIFO If th«« J, , 

iW^A.^W«Wpulsein.henextcyclethedatawUlappeJon5y«J^^^ If there « a 

Scenario 2: 

sjujbd_nfy will go low when there is stiJl 1 niece of data in thp^ prFn t^*u^-^ - » _r ^ 

sfitjbd_pldata[ir0j °""^'^-J^-^^a assert again, and so the data will appear on 

Scenario 3: 

/£r'*''-''f'v'^ J° "^^"^ *^ " « stiM 1 piece of data in the FIFO If thc«. n« 
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Read from SFU to LBn 




Figure 132. Signal waveforms between LBD and SFU
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25.8.9 HCUReadLlneFIFO sub-block 

Table 118. HCUReadLineFIFO Additional IO Definition

Port name | Pins | I/O | Description

DIU and Address Generation sub-block Signals
hrf_xadvance | 1 | In | Signal from horizontal scaling unit. 1 - supply the next dot. 0 - supply the current dot.
hrf_hcu_endofline | 1 | Out | Signal lasting 1 cycle indicating the end of the HCU read line.
hrf_diurreq | 1 | Out | Signal indicating the HCUReadLineFIFO has space for 256-bits of DIU data.
hrf_diurack | 1 | In | Acknowledge that read request has been accepted and hrf_diurreq should be de-asserted.
hrf_diurdata[63:0] | 64 | In | Data from DIU to HCUReadLineFIFO. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
hrf_diurvalid | 1 | In | Signal indicating data on hrf_diurdata is valid.
hrf_diuidle | 1 | Out | Signal indicating DIU state-machine is in the IDLE state.

25.8.9.1 General Description

The HCUReadLineFIFO sub-block comprises a double 256-bit buffer between the HCU and the DIU Interface and Address Generator sub-block. The FIFO is implemented as 8 times 64-bit words. The FIFO is written by the DIU Interface and Address Generator sub-block and read by the HCU.



Figure 133. HCUReadLineFIFO Sub-block

The DIU Interface and Address Generation (DAG) sub-block interface of the HCUReadLineFIFO is identical to the LBDPrevLineFIFO DIU interface.

Whenever 4 locations in the FIFO are free the FIFO will request 256-bits of data from the DAG sub-block by asserting hrf_diurreq. A signal hrf_diurack indicates that the request has been accepted and hrf_diurreq should be de-asserted.

The data is written to the FIFO as 64-bits on hrf_diurdata[63:0] over 4 clock cycles. The signal hrf_diurvalid indicates that the data returned on hrf_diurdata[63:0] is valid. hrf_diurvalid is used to generate the FIFO write enable, write_en, and to increment the FIFO write address, write_adr[2:0]. If the HCUReadLineFIFO still has 256-bits free then hrf_diurreq should be asserted again.

The HCUReadLineFIFO generates a signal sfu_hcu_avail to indicate that it has data available for the HCU. The HCU reads single-bit data supplied on sfu_hcu_sdata. The FIFO control logic generates a signal bit_select which selects the next bit of the 64-bit FIFO word to output on sfu_hcu_sdata. The signal hcu_sfu_advdot tells the HCUReadLineFIFO to supply the next dot (hrf_xadvance = 1) or the current dot (hrf_xadvance = 0) on sfu_hcu_sdata according to the hrf_xadvance signal from the scaling control unit in the DAG sub-block. The HCU should not generate the hcu_sfu_advdot signal until sfu_hcu_avail is true. The HCU can therefore stall waiting for the sfu_hcu_avail signal.

When the entire current 64-bit FIFO word has been read by the HCU, hcu_sfu_advdot will cause the next word to be popped from the FIFO.

The last 256-bit word for a line read from DRAM and written into the HCUReadLineFIFO can contain dots or extra padding which should not be output to the HCU. A counter in the HCUReadLineFIFO, hcuadvdot_count[15:0], counts the number of hcu_sfu_advdot strobes received from the HCU. When the count equals hcu_num_dots[15:0] the HCUReadLineFIFO must adjust the FIFO read address to point to the next 256-bit word boundary in the FIFO. This can be achieved by considering that the FIFO read address, read_adr[2:0], will require 3 bits to address 8 locations of 64-bits. The next 256-bit aligned address is calculated by inverting the MSB of read_adr and setting all other bits to 0.

    if (hcuadvdot_count == hcu_num_dots) then
        read_adr[1:0] = b00
        read_adr[2] = ~read_adr[2]

The DIU Interface and Address Generator sub-block scaling unit also needs to know when hcuadvdot_count equals hcu_num_dots. This condition is exported from the HCUReadLineFIFO as the signal hrf_hcu_endofline. When hrf_hcu_endofline is asserted the scaling unit will decide, based on vertical scaling, whether to go back to the start of the current line or go on to the next line.

25.8.9.2 DRAM Access Limitation 

The SFU must output 1 bit/cycle to the HCU. Since HCUNumDots may not be a multiple of 256 bits the last 256-bit DRAM word on the line can contain extra zeros. In this case, the SFU may not be able to provide 1 bit/cycle to the HCU. This could lead to a stall by the SFU. This stall could then propagate if the margins being used by the HCU are not sufficient to hide it. The maximum stall can be estimated by the calculation: DRAM service period - X scale factor * dots used from last DRAM read for HCU line.
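The stall estimate is simple arithmetic and can be evaluated for candidate line widths. The sketch below is illustrative: the function name and the example figures are hypothetical, and clamping the result at zero (no stall when the useful dots cover the whole service period) is an assumption rather than something stated in the text.

```python
def max_stall_cycles(dram_service_period, x_scale_factor, dots_used_in_last_word):
    """Estimate the worst-case SFU stall at the end of an HCU line.

    Implements: DRAM service period - X scale factor * dots used
    from the last DRAM read for the HCU line, clamped at zero.
    """
    stall = dram_service_period - x_scale_factor * dots_used_in_last_word
    return max(0, stall)


# Hypothetical figures: a 256-cycle service period, no X scaling,
# and only 16 useful dots in the last DRAM word of the line.
print(max_stall_cycles(256, 1, 16))
```

A larger X scale factor or more useful dots in the last word both reduce the estimate, since each useful dot keeps the HCU supplied for more cycles.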
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25.8.10 DIU interface and Address Generator Sub-block 



Table 119. DIU Interface and Address Generator Additional IO Description

Port name | Pins | I/O | Description

Internal LBDPrevLineFIFO Inputs
plf_diurreq | 1 | In | Signal indicating the LBDPrevLineFIFO has 256-bits of data free.
plf_diurack | 1 | Out | Acknowledge that read request has been accepted and plf_diurreq should be de-asserted.
plf_diurdata[63:0] | 64 | Out | Data from the DIU to LBDPrevLineFIFO. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
plf_diurvalid | 1 | Out | Signal indicating data on plf_diurdata is valid.
plf_diuidle | 1 | In | Signal indicating DIU state-machine is in the IDLE state.

Internal LBDNextLineFIFO Inputs
nlf_diuwreq | 1 | In | Signal indicating the LBDNextLineFIFO has 256-bits of data for writing to the DIU.
nlf_diuwack | 1 | Out | Acknowledge from DIU that write request has been accepted and write data can be output on nlf_diuwdata together with nlf_diuwvalid.
nlf_diuwdata[63:0] | 64 | In | Data from LBDNextLineFIFO to DIU Interface. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
nlf_diuwvalid | 1 | In | Signal indicating that data on nlf_diuwdata is valid.

Internal HCUReadLineFIFO Inputs
hrf_hcu_endofline | 1 | In | Signal lasting 1 cycle indicating the end of the HCU read line.
hrf_xadvance | 1 | Out | Signal from horizontal scaling unit. 1 - supply the next dot. 0 - supply the current dot.
hrf_diurreq | 1 | In | Signal indicating the HCUReadLineFIFO has space for 256-bits of DIU data.
hrf_diurack | 1 | Out | Acknowledge that read request has been accepted and hrf_diurreq should be de-asserted.
hrf_diurdata[63:0] | 64 | Out | Data from DIU to HCUReadLineFIFO. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
hrf_diurvalid | 1 | Out | Signal indicating data on hrf_diurdata is valid.
hrf_diuidle | 1 | In | Signal indicating DIU state-machine is in the IDLE state.
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25.8. 10. 1 Genera! Description 

The DIU Interface and Address Generator (DAG) sub-block manages the bi-level buffer in DRAM. It has a DIU write interface for the LBDNextLineFIFO and a DIU read interface shared between the HCUReadLineFIFO and LBDPrevLineFIFO.

All X and Y non-integer scaling logic is completely contained in the DAG sub-block. The only scaling output to the FIFOs is the hrf_xadvance signal to the HCUReadLineFIFO, which indicates whether to replicate the current dot or supply the next dot for horizontal scaling.

25.8.10.2 DIU Write Interface

The LBDNextLineFIFO generates all the DIU write interface signals directly except for sfu_diu_wadr[21:5] which is generated by the Address Generation logic.

The DIU write request from the LBDNextLineFIFO will be negated if its address in DRAM is invalid, nlf_adrvalid = 0. The implementation must ensure that no erroneous requests occur on sfu_diu_wreq.

Figure 134. DIU Write Interface



25.8.10.3 DIU Read Interface

The HCUReadLineFIFO and LBDPrevLineFIFO share the read interface. If both sources request simultaneously, the arbitration logic chooses between them in round-robin fashion and generates a signal, select_hrfplf, which indicates whether the DIU access is from the HCUReadLineFIFO or the LBDPrevLineFIFO.

Figure 135. DIU Read Interface multiplexing by select_hrfplf

The arbitration logic will select a DIU read request on hrf_diurreq or plf_diurreq and assert sfu_diu_rreq which goes to the DIU. The accompanying DIU read address is generated by the Address Generation Logic. The select signal select_hrfplf is set according to the arbitration winner. sfu_diu_rreq is cleared when the DIU acknowledges the request on diu_sfu_rack. Arbitration cannot take place again until the DIU state-machine of the arbitration winner is in the idle state, indicated by diu_idle. This is necessary to ensure that the DIU read data is multiplexed back to the FIFO that requested it.

Figure 136. DIU read request arbitration logic

The DIU read requests from the HCUReadLineFIFO and LBDPrevLineFIFO will be negated if their respective addresses in DRAM are invalid, hrf_adrvalid = 0 or plf_adrvalid = 0. The implementation must ensure that no erroneous requests occur on sfu_diu_rreq.

A pseudo-code description of the DIU read arbitration is given below.

    // history is of type {none, hrf, plf}; hrf is HCUReadLineFIFO, plf is LBDPrevLineFIFO
    // initialisation on reset
    select_hrfplf = 0    // default choose hrf
    history = none       // no DIU read access immediately preceding

    // if the DIU read winner's state-machine is in the idle state then de-assert busy
    if (diu_idle == 1) then
        busy = 0

    // if acknowledge received from the DIU then de-assert the DIU request
    if (diu_sfu_rack == 1) then
        // de-assert request in response to acknowledge
        sfu_diu_rreq = 0

    // if not busy then arbitrate between incoming requests
    // if request detected then assert busy
    if (busy == 0) then
        // if there is no request
        if (hrf_diurreq == 0) AND (plf_diurreq == 0) then
            sfu_diu_rreq = 0
            history = none
        // else there is a request
        else {
            // assert busy and request DIU read access
            busy = 1
            sfu_diu_rreq = 1
            // arbitrate in round-robin fashion between the requestors
            // if only HCUReadLineFIFO requesting choose HCUReadLineFIFO
            if (hrf_diurreq == 1) AND (plf_diurreq == 0) then
                history = hrf
                select_hrfplf = 0
            // if only LBDPrevLineFIFO requesting choose LBDPrevLineFIFO
            if (hrf_diurreq == 0) AND (plf_diurreq == 1) then
                history = plf
                select_hrfplf = 1
            // if both HCUReadLineFIFO and LBDPrevLineFIFO requesting
            if (hrf_diurreq == 1) AND (plf_diurreq == 1) then
                // no immediately preceding request choose HCUReadLineFIFO
                if (history == none) then
                    history = hrf
                    select_hrfplf = 0
                // if previous winner was HCUReadLineFIFO choose LBDPrevLineFIFO
                elsif (history == hrf) then
                    history = plf
                    select_hrfplf = 1
                // if previous winner was LBDPrevLineFIFO choose HCUReadLineFIFO
                elsif (history == plf) then
                    history = hrf
                    select_hrfplf = 0
        // end there is a request
        }

25.8.10.4 Address Generation Logic

The DIU interface generates the DRAM addresses of data read and written by the SFU's FIFOs.



Figure 137. Address Generation

The address generator is configured with the number of DRAM words to read in an HCU line, hcu_dram_words, the first DRAM address of the SFU area, start_sfu_adr[21:5], and the last DRAM address of the SFU area, end_sfu_adr[21:5].

Address Generation

There are four address pointers used to manage the bi-level DRAM buffer:

a. hcu_readline_adr[21:5] is the read address in DRAM for the HCUReadLineFIFO.
b. hcu_startreadline_adr[21:5] is the start address in DRAM of the current line being read by the HCUReadLineFIFO.
c. lbd_nextline_adr[21:5] is the write address in DRAM for the LBDNextLineFIFO.
d. lbd_prevline_adr[21:5] is the read address in DRAM for the LBDPrevLineFIFO.

The current values of these address pointers are readable by the CPU.

Each pointer has a corresponding address valid flag:

a. hrf_adrvalid,
b. hrf_start_adrvalid,
c. nlf_adrvalid,
d. plf_adrvalid.

DRAM requests from the FIFOs will not be issued to the DIU until the appropriate address flag is valid.
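Rules for address pointers

The address pointers must obey certain rules which indicate whether they are valid:

a. hcu_readline_adr[21:5] is only valid if it is reading earlier in the line than lbd_nextline_adr[21:5] is writing, i.e.
   hrf_adrvalid = hcu_readline_adr[21:5] != lbd_nextline_adr[21:5].
b. lbd_nextline_adr[21:5] must not write into the start of the line currently being read by the HCU, i.e.
   hrf_start_adrvalid = lbd_nextline_adr[21:5] != hcu_startreadline_adr[21:5].
c. lbd_nextline_adr[21:5] is only valid if it is not writing to the address that the LBDPrevLineFIFO is reading, and the HCU start-of-line flag is valid, i.e.
   nlf_adrvalid = (lbd_nextline_adr[21:5] != lbd_prevline_adr[21:5]) AND hrf_start_adrvalid.
d. lbd_prevline_adr[21:5] can read right up to the address that the LBDNextLineFIFO is writing, i.e.
   plf_adrvalid = lbd_prevline_adr[21:5] != lbd_nextline_adr[21:5].
e. At startup, i.e. when sfu_go is asserted, the pointers are reset to start_sfu_adr[21:5].
f. The address pointers can wrap around the SFU bi-level store area in DRAM.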
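The validity rules can be expressed as a small Python function. This is a sketch under stated assumptions: the function name and the dictionary return type are hypothetical, addresses are modelled as plain integers, and the `init` argument stands for the first-write flag used by the address generator pseudo-code.

```python
def address_valid_flags(hcu_readline_adr, hcu_startreadline_adr,
                        lbd_nextline_adr, lbd_prevline_adr, init=False):
    """Evaluate the four address valid flags from the current pointers."""
    hrf_adrvalid = hcu_readline_adr != lbd_nextline_adr
    hrf_start_adrvalid = lbd_nextline_adr != hcu_startreadline_adr
    # init allows the very first write, when all pointers are still equal.
    nlf_adrvalid = init or (
        (lbd_nextline_adr != lbd_prevline_adr) and hrf_start_adrvalid
    )
    plf_adrvalid = lbd_prevline_adr != lbd_nextline_adr
    return {
        "hrf_adrvalid": hrf_adrvalid,
        "hrf_start_adrvalid": hrf_start_adrvalid,
        "nlf_adrvalid": nlf_adrvalid,
        "plf_adrvalid": plf_adrvalid,
    }
```

Immediately after reset all four pointers are equal, so only the write side is enabled (through `init`); once the first word has been written, both read sides become valid.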
X scaling of data for HCUReadLineFIFO

If hrf_xadvance is 0 the HCUReadLineFIFO should supply the current dot.



[Figure 138: X scaling control unit. Inputs shown include xstart_count, xscale_num, xscale_denom, hrf_hcu_endofline and hcu_sfu_advdot.]

Figure 138. X scaling control unit

if (hcu_sfu_advdot == 1) then
    if (x_scale_count + x_scale_denom - x_scale_num >= 0) then
        x_scale_count = x_scale_count + x_scale_denom - x_scale_num
        hrf_xadvance = 1
    else
        x_scale_count = x_scale_count + x_scale_denom
        hrf_xadvance = 0
else
    x_scale_count = x_scale_count
    hrf_xadvance = 0
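The accumulator behaviour of the X scaling pseudo-code can be simulated in Python. This is a sketch only: it assumes that the branch taken when the test succeeds subtracts x_scale_num and asserts hrf_xadvance, and the function name and default start count are hypothetical.

```python
def x_advance_sequence(x_scale_num, x_scale_denom, n_dots, x_start_count=0):
    """Return the hrf_xadvance value produced for each of n_dots dot requests."""
    count = x_start_count
    advances = []
    for _ in range(n_dots):
        if count + x_scale_denom - x_scale_num >= 0:
            # Enough has accumulated: move on to the next source dot.
            count += x_scale_denom - x_scale_num
            advances.append(1)
        else:
            # Keep supplying the current source dot.
            count += x_scale_denom
            advances.append(0)
    return advances
```

With a scale factor of 2 (num = 2, denom = 1) the source advances on every second dot, so each source dot is supplied twice; with 3/2 the source advances twice in every three dots.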



Y scaling of data for HCUReadLineFIFO

if (hrf_hcu_endofline == 1) then
    if (y_scale_count + y_scale_denom - y_scale_num >= 0) then
        y_scale_count = y_scale_count + y_scale_denom - y_scale_num
        hrf_yadvance = 1
    else
        y_scale_count = y_scale_count + y_scale_denom
        hrf_yadvance = 0
else
    y_scale_count = y_scale_count
    hrf_yadvance = 0





[Figure 139: Y scaling control unit. Signals shown include yscale_num, yscale_denom, hrf_hcu_endofline and hrf_yadvance.]

Figure 139. Y scaling control unit

    ...
    offset = hcu_startreadline_adr - end_sfu_adr
    if (offset >= 0) then
        hcu_startreadline_adr = start_sfu_adr + offset
    hcu_readline_adr = hcu_startreadline_adr
...
    hcu_readline_adr = hcu_startreadline_adr

Figure 140 shows an overview of X and Y scaling for HCU data.
[Figure 140: the HCU read path. hcu_startreadline_adr and hcu_readline_adr point into the bi-level store in DRAM; the start of the next HCU line in DRAM is hcu_startreadline_adr + hcu_dram_words. When the DRAM reads for a line are complete, the Y-scale logic either advances to the next line or returns to the start of the current line. 256-bit DRAM words feed the HCUReadLineFIFO, whose X-scale logic uses hcu_sfu_advdot (the dot count from the HCU) to generate hrf_xadvance and supply sfu_hcu_sdata.]

Figure 140. Overview of X and Y scaling at HCU interface
Address generator pseudo-code:

Initialization:

if (sfu_go rising edge) then
    // set flag to allow first write
    init = 1
    lbd_nextline_adr[21:5] = start_sfu_adr[21:5]
    hcu_readline_adr[21:5] = start_sfu_adr[21:5]
    hcu_startreadline_adr[21:5] = start_sfu_adr[21:5]
// if first write complete
elsif (plf_adrvalid == 1) then
    // reset flag allowing first write
    init = 0

Address valid signals:

hrf_adrvalid = hcu_readline_adr != lbd_nextline_adr
hrf_start_adrvalid = lbd_nextline_adr != hcu_startreadline_adr
nlf_adrvalid = init OR ((lbd_nextline_adr != lbd_prevline_adr) AND hrf_start_adrvalid)
plf_adrvalid = lbd_prevline_adr != lbd_nextline_adr

Address pointer updating:

// LBDNextLineFIFO
// if DIU write acknowledge and LBDNextLineFIFO address is valid
if (diu_sfu_wack == 1 AND nlf_adrvalid == 1) then
    // if end of SFU address range
    if (lbd_nextline_adr == end_sfu_adr) then
        // go to start of SFU address range
        lbd_nextline_adr = start_sfu_adr
    else
        // increment address pointer
        lbd_nextline_adr = lbd_nextline_adr + 1

// LBDPrevLineFIFO
// if DIU read acknowledge and LBDPrevLineFIFO address is valid
if (plf_diurack == 1 AND plf_adrvalid == 1) then
    // if end of SFU address range
    if (lbd_prevline_adr == end_sfu_adr) then
        lbd_prevline_adr = start_sfu_adr
    else
        lbd_prevline_adr = lbd_prevline_adr + 1

// HCUReadLineFIFO
// if DIU read acknowledge and HCUReadLineFIFO address is valid
if (hrf_diurack == 1 AND hrf_adrvalid == 1) then
    ...
        offset = hcu_startreadline_adr - end_sfu_adr
        if (offset >= 0) then
            hcu_startreadline_adr = start_sfu_adr + offset
        hcu_readline_adr = hcu_startreadline_adr
    ...
        hcu_readline_adr = hcu_startreadline_adr
    ...
        // if end of SFU address space
        if (hcu_readline_adr == end_sfu_adr) then
            // go to start of SFU address space
            hcu_readline_adr = start_sfu_adr
        else
            // increment address pointer
            hcu_readline_adr = hcu_readline_adr + 1
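The pointer update rules rely on two wrap-around operations, which can be sketched in Python. The function names are hypothetical and addresses are modelled as integers in units of 256-bit DRAM words; the second function follows the offset calculation exactly as it appears in the pseudo-code.

```python
def next_pointer(adr, start_sfu_adr, end_sfu_adr):
    """Increment a FIFO pointer, wrapping from the end of the SFU area to its start."""
    if adr == end_sfu_adr:
        return start_sfu_adr
    return adr + 1


def wrap_line_start(hcu_startreadline_adr, start_sfu_adr, end_sfu_adr):
    """Fold a line start address that has run past the SFU area back into it."""
    offset = hcu_startreadline_adr - end_sfu_adr
    if offset >= 0:
        return start_sfu_adr + offset
    return hcu_startreadline_adr
```

For an SFU area spanning addresses 0 to 5, `next_pointer(5, 0, 5)` wraps to 0, and a line start that has advanced to 7 folds back to 2.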
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26 Tag Encoder (TE) 



26.1 Overview 



... in the X-direction. Thus, the TE can render tags at resolutions less than 1600 dpi, which can subsequently be scaled up to 1600 dpi.



[Figure 141: the TE in context, showing the DRAM interface, the encoder, the PCU, the tag FIFO, the halftoner/compositor and the te_finishedband signal.]

Figure 141. High level block diagram of TE in context

Limited functionality can be provided on offset-printed pages using black ink on otherwise blank areas of the page, for example to encode buttons.
26.2 What are tags?



The first barcode was described by Woodland and Silver and finally patented in 1952 (US Patent 2,612,994), when electronic parts were scarce and very expensive. Now however, with the advent of cheap and readily available computer technology, nearly every item purchased from a shop contains a barcode of some description on the packaging. From books to CDs to grocery items, the barcode provides a convenient way of identifying an object by a product number. The exact interpretation of the product number depends on the type of barcode. Warehouse inventory tracking systems let users define their own product number ranges, while inventory in shops must be more universally encoded so that products from one company don't overlap with products from another company. Universal Product Codes (UPC) were defined for this purpose at the request of the National Association of Food Chains.

The older barcode formats contain characters that are displayed in the form of lines. The combination of black and white lines describes the information the barcode contains. Often there are two types of lines to form the complete barcode: the characters (the information itself) and lines to separate blocks for better optical recognition. While the information may change from barcode to barcode, the lines to separate blocks stay constant. The lines to separate blocks can therefore be thought of as part of the constant structure components of the barcode.
Barcodes are read with specialized reading devices that then pass the extracted data on to the computer for further processing. For example, a point-of-sale scanning device allows the sales assistant to add the scanned item to the current sale, with the item shown on a display device for verification. Scanners, slot readers, and cameras are among the many devices used to read the barcodes.

To help ensure that the data extracted was read correctly, checksums were introduced as a crude form of error detection. More recent barcode formats, such as the Aztec 2D barcode developed by Andy Longacre, use redundancy encoding schemes such as Reed-Solomon. Reed-Solomon encoding is adequately discussed in [24], [26] and [30]. The reader is advised to refer to these sources for background information. Very often the degree of redundancy encoding is user selectable.

More recently there has also been a move from the simple one-dimensional barcodes (line based) to two-dimensional barcodes. Instead of storing the information as a series of lines, where the data can be extracted from a single dimension, the information is encoded in two dimensions. Just as with the original barcodes, the 2D barcode contains both information and structural components for better optical recognition. Figure 142 shows an example of a QR Code (Quick Response Code), developed by Denso of Japan (US patent number US5726435). Note that the barcode cell is comprised of two areas: a data area (depending on the data being stored in the barcode), and a constant position detection pattern. The constant position detection pattern is used by the reader to help locate the cell itself, then to locate the cell boundaries, and to allow the reader to determine the original orientation of the cell (orientation can be determined by the fact that there is no 4th corner pattern).



[Figure 142: a QR Code, 21 blocks wide, with the position detection pattern and the data area labelled.]

Figure 142. Example QR Code developed by Denso of Japan

The number of barcode encoding schemes grows daily. Yet very often the hardware for producing these barcodes is specific to the particular barcode format. As printers become more and more embedded, there is an increasing desire for real-time printing of these barcodes. In particular, Netpage-enabled applications require the printing of 2D barcodes (or tags) over the page, preferably in infra-red ink. The tag encoder in SoPEC uses a generic barcode format encoding scheme which is particularly suited to real-time printing. Since the barcode encoding format is generic, the same rendering hardware engine can be used to produce a wide variety of barcode formats.

Unfortunately the term "barcode" is interpreted in different ways by different people. Sometimes it refers only to the data area component, and does not include the constant position detection pattern. In other cases it refers to both data and constant position detection pattern.

We therefore use the term tag to refer to the combination of data and any other components (such as position detection pattern, blank surrounding space etc.) that must be rendered to help hold or locate/read the data. A tag therefore contains the following components:

• data area(s). The data area is the whole reason that the tag exists. The tag data area(s) contains the encoded data (optionally redundancy-encoded, perhaps simply checksummed), where the bits of the data are placed within the data area at locations specified by the tag encoding scheme.

• constant background patterns, which typically include a constant position detection pattern. These help the tag reader to locate the tag. They include components that are easy to locate and may contain orientation and perspective information in the case of 2D tags. Constant background patterns may also include such patterns as a blank area surrounding the data area or position detection pattern. These blank patterns can aid in the decoding of the data by ensuring that there is no interference between tags or data areas.

In most tag encoding schemes there is at least some constant background pattern, but it is not necessarily required by all. For example, if the tag data area is enclosed by a physical space and the reading means uses a non-optical location mechanism (e.g. physical alignment of surface to data reader) then a position detection pattern is not required.

Different tag encoding schemes have different sized tags, and have different allocations of physical tag area to constant position detection pattern and data area. For example, the QR code has 3 fixed blocks at the edges of the tag for the position detection pattern (see Figure 142) and a data area in the remainder. By contrast, the Netpage tag structure (see Figures 143 and 144) contains a circular locator component, an orientation feature, and several data areas. Figure 143(a) shows the Netpage tag constant background pattern ... represented by many physical output dots to form a block within the data area.




[Figure 143: (a) Netpage tag background pattern; (b) Netpage tag showing data areas.]

Figure 143. Netpage tag structure

[Figure 144: magnified view of a rendered Netpage tag.]

Figure 144. Netpage tag with data rendered at 1600 dpi (magnified view)

26.2.1 Contents of the data area

The data area contains the data for the tag. The number of printed dots that represent each data bit depends on the rendering resolution and the target reading/scanning resolution. For example, in the QR code (see Figure 142), a single bit is represented by a dark module or a light module, where the exact number of dots in the dark module or light module depends on the rendering resolution and target reading/scanning resolution. For example, a module may be represented by a square block of printed dots (all on for binary 1, or all off for binary 0), as shown in Figure 145.



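[Figure 145: the QR code of Figure 142 rendered with each block as a 2 x 2 group of dots, so that 21 blocks wide = 42 dots wide; the position detection pattern and a single 2 x 2 block are labelled.]

Figure 145. Example of 2x2 dots for each block of QR code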

The point to note here is that a single bit of data may be represented in the printed tag by an arbitrary printed shape. The smallest shape is a single printed dot, while the largest shape is theoretically the whole tag itself, for example a giant macrodot comprised of many printed dots in both dimensions.

An ideal generic tag definition structure allows the generation of an arbitrary printed shape from each bit of data.



26.2.2 What do the bits represent?

Given an original number of bits of data, and the desire to place those bits into a printed tag for subsequent retrieval via a reading/scanning mechanism, the original number of bits can either be placed directly into the tag, or they can be redundancy-encoded in some way. The exact form of redundancy encoding will depend on the tag format.

The placement of data bits within the data area of the tag is directly related to the redundancy mechanism employed in the encoding scheme. The idea is generally to place data bits together in 2D so that burst errors are averaged out over the tag data, thus typically being correctable. For example, all the bits of a Reed-Solomon codeword would be spread out over the entire tag data area so as to minimize being affected by a burst error.

Since the data encoding scheme and the shape and size of the tag data area are closely linked, it is desirable to have a generic tag format structure. This allows the same data structure and rendering embodiment to be used to render a variety of tag formats.

26.2.2.1 Fixed and variable data components

In many cases, the tag data can be reasonably divided into fixed and variable components. For example, if a tag holds N bits of data, some of these bits may be fixed for all tags while some may vary from tag to tag. For example, the Universal Product Code allows a country code and a company code. Since these bits don't change from tag to tag, these bits can be defined as fixed, and don't need to be provided to the tag encoder each time, thereby reducing the bandwidth when producing many tags.
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for each tag. By reducing of vS.fl? T °^ "^"^ "^^^ '^^ ^8 diflferent 

Pletely variable. wluleSj^tg en^c<SerS^ SoiES:' ' Z^"^ 1^ '""^^ 

of tag data bits. ^ have a maximum number 

Redundancy-encode the tag data within the tag encoder

X^i^bS^fn^^^^r^^^^^ 

to significant savings of bandwi^Solt? """^ "^"^ 
Scol'2sr?20tiS?6S^t;^^^^ 

tive bandwidth and internal rtOR«e lS*i?tT^^ ^^ '^"^ ^ 

encoded data was r^^ZaT "^-^ if *e 

26.3 Placement of tags on a page 

The TE places tags on the page in a triangular grid arrangement as shown in Figure 146. 

[Figure 146: tag placement on the triangular grid for portrait and landscape orientation, with the dot direction and line direction marked for each.]

Figure 146. Placement of tags for portrait & landscape printing



I^atS^jfcTonag^^ra^^^^ 
respond to the same pL^Tgl^Z ^^■S'Z^"^. "^^^ °° *at line cor- 

The triangular placement can be considered as alternative lines of tags, where one line of tags is inset by a number of dots in the dot dimension relative to the other line. The dot inter-tag gap is the same in both lines of tags, and is different from the line inter-tag gap.

For portrait and landscape printing the parameters of line and dot are swapped, but the placement mechanism is the same.
The general case for placement of tags therefore relies on a number of parameters, as shown in Figure 147.

[Figure 147: tags within their bounding boxes laid out in the dot direction and line direction, annotated with Start Position, AltTagLinePosition, Tag width, Tag height, Dot inter-tag gap and Line inter-tag gap.]

Figure 147. General representation of tag placement
The parameters are more formally described in Table 120. Note that these are placement parameters and not registers.

Table 120. Tag placement parameters

Parameter | Description | Restrictions
Tag height | ... | minimum 1
Tag width | The number of dots in a single line of the tag's bounding box. The number of dots in the tag itself may vary depending on the shape of the tag, but the number of dots in the bounding box will be constant (by definition). | minimum 1
Dot inter-tag gap | The number of dots from the edge of one tag's bounding ... | minimum = 0
Line inter-tag gap | The number of dot lines from the edge of one tag's ... | minimum = 0
Start Position | ... |
AltTagLinePosition | ... inter-tag gap (the row position is always 0). |

26.4 Basic tag encoding parameters

... a simple matter to adjust the buffer sizes and corresponding addressing to allow arbitrary encoding parameters in future implementations.

Table 121. Encoding parameters

Parameter | Maximum value in SoPEC
page width | 2^14 dot pairs or 20.48 inches at 1600 dpi
tag size | typical tag size is 2 mm x 2 mm; maximum tag size is 384 dots x 384 dots before scaling, i.e. 6 mm x 6 mm at 1600 dpi
number of dots in each dimension of the tag | 384 dots before scaling
redundancy encoding for tag data | Reed-Solomon GF(2^4) at 5:10 or 7:8
size of fixed data (unencoded) | 40 or 56 bits
size of redundancy-encoded fixed data | 120 bits
size of variable data (unencoded) | 120 or 112 bits
size of redundancy-encoded variable data | 360 or 240 bits
tags per page width | 85 packed 6 mm x 6 mm tags (384 x 384 dots) will fit in 20.48 inches


suppliedas .20 bits of pxe-^^^^^dS^nS^J^aJJiX " ' ^^'^""'^^'^ 

'^^Z^it^£S%Torl7:^^ '12 or iaOdatabits that are variable for each rag. Vari- 
26.4. 1 .'but ^^yt."::^ ^ " "^y^-^^ - lection 



26.4.1 Redundancy encoding 



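The tag data is redundancy encoded using Reed-Solomon encoding so that burst errors can be tolerated. Reed-Solomon encoding is adequately discussed in [24], [26] and [30]; the reader is advised to refer to these sources for background information.

The encoding uses 4-bit symbols from GF(2^4), giving a codeword length of 15 symbols, i.e. a codeword length of 60 bits. The primitive polynomial is $p(x) = x^4 + x + 1$. The number of symbols that can be corrected is half the number of redundancy symbols.

There are two possibilities for encoding:

• 5 symbols of original data (20 bits) and 10 redundancy symbols (40 bits), which can correct up to 5 symbols in error. The generator polynomial is therefore $g(x) = (x + \alpha)(x + \alpha^2) \cdots (x + \alpha^{10})$.

• 7 symbols of original data (28 bits) and 8 redundancy symbols (32 bits), which can correct up to 4 symbols in error. The generator polynomial is $g(x) = (x + \alpha)(x + \alpha^2) \cdots (x + \alpha^{8})$.

In the first case, with 5 symbols of original data, the total amount of original data per tag is 160 bits (40 fixed, 120 variable). This is redundancy encoded to give a total amount of 480 bits (120 fixed, 360 variable) as follows: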
• Each tag contains up to 40 bits of fixed original data. Therefore 2 codewords are required for the fixed data, giving a total encoded data size of 120 bits. Note that this fixed data only needs to be encoded once per page.

• Each tag contains up to 120 bits of variable original data. Therefore 6 codewords are required for the variable data, giving a total encoded data size of 360 bits.

In the second case, with 7 symbols of original data, the total amount of original data per tag is 168 bits (56 fixed, 112 variable). This is redundancy encoded to give a total amount of 360 bits (120 fixed, 240 variable) as follows:

• Each tag contains up to 56 bits of fixed original data. Therefore 2 codewords are required for the fixed data, giving a total encoded data size of 120 bits. Note that this fixed data only needs to be encoded once per page.

• Each tag contains up to 112 bits of variable original data. Therefore 4 codewords are required for the variable data, giving a total encoded data size of 240 bits.

The choice of data to redundancy ratio depends on the application. 
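The two encoding options can be illustrated with a small systematic Reed-Solomon encoder over GF(2^4) in Python. This is a textbook sketch, not the SoPEC encoder: it assumes the primitive polynomial x^4 + x + 1, generator roots at alpha^1 onwards, and a codeword laid out as data symbols followed by redundancy symbols (highest-order coefficient first). All function names are hypothetical.

```python
PRIM_POLY = 0b10011  # x^4 + x + 1

# Build exponent/log tables for GF(16); EXP is doubled to avoid a modulo.
EXP = [0] * 30
LOG = [0] * 16
_x = 1
for _i in range(15):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x10:
        _x ^= PRIM_POLY
for _i in range(15, 30):
    EXP[_i] = EXP[_i - 15]


def gf_mul(a, b):
    """Multiply two GF(16) elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_redundancy):
    """g(x) = (x + a)(x + a^2)...(x + a^n), coefficients highest order first."""
    g = [1]
    for i in range(1, n_redundancy + 1):
        root = EXP[i]
        nxt = g + [0]                      # g(x) * x
        for j, coef in enumerate(g):       # plus g(x) * root
            nxt[j + 1] ^= gf_mul(coef, root)
        g = nxt
    return g


def rs_encode(data_symbols, n_redundancy):
    """Return data symbols followed by n_redundancy parity symbols."""
    g = generator_poly(n_redundancy)
    rem = list(data_symbols) + [0] * n_redundancy
    for i in range(len(data_symbols)):
        coef = rem[i]
        if coef:
            for j in range(1, len(g)):
                rem[i + j] ^= gf_mul(g[j], coef)
    return list(data_symbols) + rem[len(data_symbols):]


def poly_eval(poly, x):
    """Evaluate a polynomial (highest order first) at x using Horner's rule."""
    y = 0
    for coef in poly:
        y = gf_mul(y, x) ^ coef
    return y


def encoded_bits(raw_bits, data_symbols_per_codeword):
    """Encoded size in bits: one 60-bit codeword per group of data symbols."""
    codewords = -(-raw_bits // (data_symbols_per_codeword * 4))
    return codewords * 15 * 4
```

A valid codeword evaluates to zero at every root of the generator polynomial, which is what a decoder's syndrome calculation checks. `encoded_bits(120, 5)` and `encoded_bits(112, 7)` reproduce the 360-bit and 240-bit variable data sizes quoted above.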

26.5 Data structures used by tag encoder

26.5.1 Tag Format Structure 

The Tag Format Structure (TFS) is the template used to render tags, optimized so that the tag can be rendered in real time. The TFS contains an entry for each dot position within the tag's bounding box. Each entry specifies whether the dot is part of the constant background pattern or part of the tag's data component (both fixed and variable).

The TFS is very similar to a bitmap in that it contains one entry for each dot position of the tag's bounding box. The TFS therefore has TagHeight x TagWidth entries, where TagHeight matches the height of the bounding box for the tag in the line dimension, and TagWidth matches the width of the bounding box for the tag in the dot dimension. A single line of TFS entries for a tag is known as a tag line structure.

The TFS consists of TagHeight number of tag line structures, one for each 1600 dpi line in the tag's bounding box. Each tag line structure contains three contiguous tables, known as tables A, B, and C. Table A contains 384 2-bit entries, one entry for each of the maximum number of dots in a single line of a tag (see Table 121). The actual number of entries used should match the size of the bounding box for the tag in the dot dimension, but all 384 entries must be present (see footnote 1). Table B contains 32 9-bit data addresses that refer to (in order of appearance) the data dots present in the particular line. All 32 entries must be present, even if fewer are used. Table C contains two 5-bit pointers into table B, and is stored in the 10 low bits of the next 32-bit word (the upper 22 bits are unused). The total length of each tag line structure is therefore 34 x 32-bit words. Padding (18 x 32-bit words) is inserted after every 7 tag line structures to keep each tag line



1. This is done so that it is possible to go from one line within a tag to the next by simply adding 33 in 32-bit based addressing to DRAM.
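The layout arithmetic described in the text can be checked with a short Python sketch. It takes the 34-word tag line structure and the 18 words of padding after every 7 structures from the paragraph above; the constant and function names are hypothetical, and the offset formula is an inference from those figures rather than a stated SoPEC address calculation.

```python
TABLE_A_WORDS = 384 * 2 // 32   # 768 bits -> 24 words
TABLE_B_WORDS = 32 * 9 // 32    # 288 bits -> 9 words
TABLE_C_WORDS = 1               # 10 bits used, 22 bits unused
LINE_WORDS = TABLE_A_WORDS + TABLE_B_WORDS + TABLE_C_WORDS  # 34 words

LINES_PER_GROUP = 7
PAD_WORDS = 18
GROUP_WORDS = LINES_PER_GROUP * LINE_WORDS + PAD_WORDS      # 256 words


def tag_line_word_offset(line):
    """32-bit word offset of a tag line structure from the start of the TFS."""
    group, index = divmod(line, LINES_PER_GROUP)
    return group * GROUP_WORDS + index * LINE_WORDS
```

Under these assumptions each group of 7 tag line structures plus padding occupies exactly 256 words (1 KByte), so tag line structure 7 starts at word offset 256.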
[Figure 148: the Tag Format Structure in DRAM is a sequence of tag line structures, with a reserved and unused block of 18 x 32-bit words after every 7 tag line structures. Each tag line structure consists of table A (384 entries x 2 bits = 768 bits), table B (32 entries x 9 bits = 288 bits), table C (2 entries x 5 bits = 10 bits) and 22 reserved and unused bits.]

Figure 148. Composition of SoPEC's tag format structure

A full description of the interpretation and usage of Tables A, B and C is given in section 26.8.3.



26.5.1.1 Scaling a tag

If the size of the printed dots is too small, then the tag can be scaled in one of several ways. One way is to scale the TFS itself: to create a new TFS at twice the size from the old, we would repeat each entry across each line of the TFS, and then we would repeat each line of the TFS. The net number of entries in the TFS would be increased fourfold (2 x 2).

The TFS allows the creation of macrodots instead of simple scaling. Looking at Figure 149, which shows a simple 3 x 3 tag structure, if we simply performed replication by 7 in each dimension, either by increasing the size of the TFS by 7 in each dimension or by putting a scale-up on the output of the tag generator, then we would have 9 sets of 7 x 7 square blocks.
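Simple replication of a TFS can be sketched in Python as follows. The TFS is modelled here as a list of rows of entries, which is an illustrative simplification rather than the DRAM layout, and `replicate_tfs` is a hypothetical name.

```python
def replicate_tfs(entries, factor):
    """Scale a TFS by repeating each entry and each line `factor` times."""
    return [
        [entry for entry in row for _ in range(factor)]
        for row in entries
        for _ in range(factor)
    ]
```

Replicating a 3 x 3 structure by 7 gives a 21 x 21 structure made of nine square 7 x 7 blocks, and replicating by 2 increases the number of entries fourfold.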



Doc: SoPEC_hardware_design 
Version: 2.3 



S3 Proprietary Document 



2TNOV 2002 
Page 366 



SoPEC : Hardware Design 



S3 



Instead, we can replace each of the original dots in the TFS by a 7 x 7 dot definition of a rounded dot. Figure 150 shows the results.





[Figure 149: a 3 x 3 tag. The top line is the position detection pattern (1 line, all dark), made of three dots that are always 1 (background); the remaining 2 lines of 3 bits form the data area holding data bits 0 to 5.]

Figure 149. Simple 3x3 tag structure



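[Figure 150: the 3 x 3 tag of Figure 149 redrawn over a 21 x 21 dot area, with each original dot replaced by a 7 x 7 rounded dot. Legend: constant 0, constant 1, data bit 0, data bit 1, data bit 2, data bit 3, data bit 4, data bit 5.]

Figure 150. 3x3 tag redesigned for 21 x 21 area (not simple replication)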

Consequently, the higher the resolution of the TFS, the more printed dots can be printed for each macrodot, where a macrodot represents a single data bit of the tag. The more dots that are available to produce a macrodot, the more complex the pattern of the macrodot can be. As an example, Figure 144 shows the Netpage tag structure rendered such that the data bits are represented by an average of 8 dots x 8 dots (at 1600 dpi), but the actual shape structure of a dot is not square. This allows the printed Netpage tag to be subsequently read at any orientation.

26.5.2 Raw tag data 

The TE requires a band of unencoded variable tag data if variable data is to be included in the tag bit-plane. A band of unencoded variable tag data is a set of contiguous unencoded tag data records, in order of encounter from the top left of the printed band to the lower right.

An unencoded tag data record is 128 bits arranged as follows: bits 0-111 or 0-119 are the bits of raw tag data, bit 120 is a flag used by the TE (TagIsPrinted), and the remaining 7 bits are reserved (and should be 0). Having a record size of 128 bits simplifies the tag data access since the data of two tags fits into a 256-bit DRAM word. It also means that the flags can be stored apart from the tag data, thus keeping the raw tag data completely unrestricted. If there is an odd number of tags in a line then the last DRAM read will contain a tag in the first 128 bits and padding in the final 128 bits.
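The record layout can be sketched in Python using integers as bit vectors. This is a minimal illustration under stated assumptions: bit 0 is taken to be the least significant bit, the "first 128 bits" of a DRAM word are taken to be the low half, and all function names are hypothetical.

```python
TAG_IS_PRINTED_BIT = 120
RECORD_BITS = 128


def pack_tag_record(raw_data, data_bits, tag_is_printed):
    """Pack raw tag data (112 or 120 bits) and the TagIsPrinted flag."""
    if data_bits not in (112, 120):
        raise ValueError("raw tag data is 112 or 120 bits")
    if raw_data < 0 or raw_data >> data_bits:
        raise ValueError("raw data does not fit in data_bits")
    # Bits 121-127 (and 112-119 for 112-bit data) are left as 0.
    return raw_data | (int(tag_is_printed) << TAG_IS_PRINTED_BIT)


def unpack_tag_record(record, data_bits):
    """Return (raw_data, tag_is_printed) from a 128-bit record."""
    raw_data = record & ((1 << data_bits) - 1)
    return raw_data, bool((record >> TAG_IS_PRINTED_BIT) & 1)


def pack_dram_word(first_record, second_record=0):
    """Place two 128-bit records in one 256-bit word; the second may be padding."""
    return first_record | (second_record << RECORD_BITS)
```

For an odd number of tags in a line, the last word would be built with `pack_dram_word(last_record)`, leaving the upper 128 bits as padding.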

The TagIsPrinted flag allows the effective specification of a tag resolution mask over the page. For each tag position the TagIsPrinted flag determines whether any of the tag is printed or not. This allows arbitrary placement of tags on the page. For example, tags may only be printed over particular active areas of a page. The TagIsPrinted flag allows only those tags to be printed. TagIsPrinted is a 1-bit flag with values as shown in Table 122.



Table 122. TagIsPrinted values

Value | Description
0 | Don't print the tag in this tag position. Output 0 for each dot within the tag bounding box.
1 | Print the tag as specified by the various tag structures.





26.5.3 DRAM storage requirements 

The total DRAM storage required by a single band of raw tag data depends on the number of tags present in that band. Each tag requires 128 bits. Consequently if there are N tags in the band, the size in DRAM is 16N bytes.

The maximum size of a line of tags is 163 x 128 bits. When maximally packed, a row of tags contains 163 tags (see Table 121) and extends over a minimum of 126 print lines. This equates to 282 KBytes over a Letter page.

The total DRAM storage required by a single TFS is TagHeight/7 KBytes (including padding). Since the likely maximum value for TagHeight is 384 (given that SoPEC restricts TagWidth to 384), the maximum size in DRAM for a TFS is 55 KBytes.
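The storage figures can be reproduced with a few lines of Python. The function names are hypothetical, and rounding TagHeight/7 up to a whole KByte is an assumption based on the padded groups of 7 tag line structures.

```python
import math


def raw_tag_band_bytes(n_tags):
    """Each tag record is 128 bits, i.e. 16 bytes."""
    return 16 * n_tags


def tfs_kbytes(tag_height):
    """Each padded group of 7 tag line structures occupies 1 KByte."""
    return math.ceil(tag_height / 7)
```

`tfs_kbytes(384)` gives the 55 KBytes quoted for the maximum tag height, and a typical 126-line tag needs 18 KBytes.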

26.5.4 DRAM access requirements 

The TE has two separate read interfaces to DRAM for raw tag data, TD, and tag format structure, TFS.

The memory usage requirements are shown in Table 123. Raw tag data is stored in the compressed page store.



Table 123. Memory usage requirements

Block | Size | Description
Compressed page store | 2048 Kbytes | Compressed data page store for bi-level, contone and raw tag data.
Tag Format Structure | 55 Kbyte (384 dot line tags @ 1600 dpi) | 55 kB in PEC1 for 384 dot line tags (the benchmark) at 1600 dpi. 2.5 mm tags (1/10th inch) @ 1600 dpi require 160 dot lines = 160/384 x 55 or 23 kB. 2.5 mm tags @ 800 dpi require 80/384 x 55 = 12 kB.



The TD interface will read 256 bits from DRAM at a time. Each 256-bit read returns two 128-bit tags. The TD interface to the DIU will be a 256-bit double buffer. If there is an odd number of tags in a line then the last DRAM read will contain a tag in the first 128 bits and padding in the final 128 bits.

The TFS interface will also read 256 bits from DRAM at a time. The TFS required for a line is 136 bytes. A total of five 256-bit DRAM reads is required to read the TFS for a line, with 192 unused bits in the fifth 256-bit word. A 136-byte double-line buffer will be implemented to store the TFS data.

The TE's DIU bandwidth requirements are summarized in Table 124.

Table 124. DRAM bandwidth requirements

Block | Direction | Access pattern | Bandwidth figures
TD | Read | Single 256-bit reads (note 1). | 1.02, 1.02
TFS | Read | Single 256-bit reads (note 2). TFS is 136 bytes. This means there is unused data in the fifth 256-bit read. A total of 5 reads is required. | 0.093, 0.093

1: Each 2 mm tag lasts 126 dot cycles and requires 128 bits. This is a rate of 256 bits every 252 cycles.
2: 17 x 64-bit reads per line in PEC1 is 5 x 256-bit reads per line in SoPEC, with unused bits in the last 256-bit read.
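The arithmetic behind the footnotes can be checked with a short Python sketch. The function names and default arguments are hypothetical; the inputs (126 dot cycles per tag, 128 bits per tag, 136 bytes of TFS per line, 256-bit DRAM words) are taken from the text.

```python
import math


def td_bits_per_cycle(tag_cycles=126, bits_per_tag=128):
    """Raw tag data rate: one 128-bit record per tag width in dot cycles."""
    return bits_per_tag / tag_cycles


def tfs_reads_per_line(line_bytes=136, word_bits=256):
    """Return (number of DRAM reads, unused bits in the last read)."""
    line_bits = line_bytes * 8
    reads = math.ceil(line_bits / word_bits)
    return reads, reads * word_bits - line_bits
```

`td_bits_per_cycle()` rounds to the 1.02 shown for TD, and `tfs_reads_per_line()` gives 5 reads with 192 unused bits, matching the description of the TFS interface.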



26.5.5 Tag sizes 



SoPEC allows for tags to be between 0 and 384 dots. A typical 2 mm tag requires 126 dots. Short tags do not change the internal bandwidth or throughput behaviours at all. Tag height is specified so as to allow the DRAM storage for raw tag data to be specified. Minimum tag width is a condition imposed by throughput limitations, so if the width is too small the TE cannot consistently produce 2 dots per cycle across several tags (there are also raw tag data bandwidth implications). Thinner tags still work; they just take longer and/or need scaling.
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26.6 Implementation 



26.6.1 Tag Encoder Architecture 

A block diagram of the TE can be seen below. 



[Figure 151: the tag encoder unit sits between the DRAM interface, the PCU and the Tag FIFO Unit. It contains a TFS interface and a tag data interface; signals shown include advtagline, lastdotintag and tdValid.]

Figure 151. TE Block Diagram

The TE writes lines of bi-level tag plane data to the TFU for later reading by the HCU. The TE is responsible for merging the encoded tag data with the tag structure (interpreted from the TFS). Y-integer scaling of tags is performed in the TE, with X-integer scaling of the tags performed in the TFU. The encoded tag layer is generated 2 bits at a time and output to the TFU at this rate. The HCU however only consumes ...

The tag encoder consists of a TFS interface that loads and decodes TFS entries, a tag data interface that loads tag raw data, encodes it, and provides bit values on request, and a state machine. The TE has two separate read interfaces to DRAM for raw tag data, TD, and tag format structure, TFS.

It is possible that the raw tag data interface, the TD, to the DIU could be replaced by a hardware state



26.6.2 Y-Scaling output lines 



In order to support scaling in the Y direction, the following modifications to the PEC1 TE are suggested to the Tag Data Interface, Tag Format Structure Interface and TE Top Level:

• For the Tag Data Interface: program the configuration registers of Table 126, firstTagLineHeight and tagMaxLine, with the true value, i.e. not multiplied up by the scale factor YScale. Within the Tag Data Interface there are two counters, countx and county, that have a direct bearing on the rawTagDataAddr generation. countx is decremented as tags are read from DRAM. It is reset to NumTags[RtdTagSense] at the start of each line of tags. county is decremented as each line of tags is completely read from DRAM, i.e. countx = 0. Scaling may be performed by counting the number of times countx reaches zero and only decrementing county when this number reaches YScale. This causes the Tag Data Interface to read each line of tag data NumTags[RtdTagSense] * YScale times.

• For the Tag Format Structure Interface: the implication of Y-scaling for the TFS is that each Tag Line Structure is used YScale times. This may be accomplished in either of two ways:

  • Read each Tag Line Structure once from DRAM and reuse it YScale times. This involves gating the control of TFS buffer flipping with YScale. Because of the way in which the advTfsLine and advTagLine related functionality is coded in the PEC1 TFS, this solution is judged to be error-prone.

  • Fetch each Tag Line Structure YScale times. This solution involves controlling the activity of currTfsAddr with YScale. In SoPEC each individual Tag Line Structure is read using 5 accesses to the DIU. This is different from the behaviour in PEC1, where one address is given and 17 data-words were returned by the DIU. Since the behaviour of currTfsAddr must be changed to meet the requirements of the SoPEC DIU, it makes sense to include the Y-scaling in this change, i.e. a count of the number of completed sets of accesses is compared to YScale. Only when this count equals YScale can currTfsAddr be loaded with the base address of the next line's Tag Line Structure in DRAM; otherwise it is re-loaded with the base address of the current line's Tag Line Structure in DRAM.

• For the Top Level: the Top Level of the TE has a counter, LinePos, which is used to count the number of completed output lines when in a tag gap or in a line of tags. At the start (i.e. top-left hand dot-pair) of a gap or tag, LinePos is loaded with either TagGapLine or TagMaxLine. The value of LinePos is decremented as each output line completes. Y-scaling may be accomplished by gating the decrement of LinePos based on the YScale value.
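The countx/county gating suggested for the Tag Data Interface can be sketched in software. This is a minimal behavioural model, not the TE logic itself: the function name, its arguments and the returned list are assumptions made for illustration, and only the counter behaviour (countx reloaded per pass, county decremented once every YScale passes) follows the description above.

```python
def tag_line_reads(num_tag_lines, tags_per_line, yscale):
    """Return the (tag line, tag index) pairs in the order they are fetched.

    Hypothetical model: countx counts tags remaining (minus 1) in the current
    pass over a line; county counts tag lines remaining (minus 1). county is
    only decremented after countx has reached zero yscale times.
    """
    if num_tag_lines < 1 or tags_per_line < 1 or yscale < 1:
        raise ValueError("all arguments must be at least 1")
    reads = []
    county = num_tag_lines - 1   # registers hold "minus 1" values
    line = 0
    passes = 0                   # times countx has reached zero on this line
    while True:
        countx = tags_per_line - 1
        while True:
            reads.append((line, tags_per_line - 1 - countx))
            if countx == 0:
                break
            countx -= 1
        passes += 1
        if passes == yscale:     # gate the county decrement with YScale
            passes = 0
            if county == 0:
                break
            county -= 1
            line += 1
    return reads
```

With two lines of three tags and a scale factor of 2, each line of tag data is fetched twice before the model moves on to the next line.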
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26.6.3 TE Physical Hierarchy 

Figure 152. TE Hierarchy
(Diagram not reproduced. The Tag Encoder contains: the Top Level FSM + PCU + combinational logic for muxing etc.; the Tag Data Interface, comprising the Raw Tag Data Interface, Reed Solomon Encoder, 2D Decoder and Encoded Tag Data Interface holding encoded fixed tag data and encoded variable tag data; and the Tag Format Structure (TFS), comprising Table A, Table B and Table C with registered outputs.)



Figure 152 illustrates the structural hierarchy of the TE. The top level contains the Tag Data Interface (TDI), the Tag Format Structure (TFS), an FSM to control the generation of dot pairs, and combinational logic for muxing the output data and generating other control signals.

At the highest level, the TE state machine processes the output lines of a page one line at a time.

26.6.4 IO Definitions



Table 125. TE Port List

Port name                     Pins  I/O  Description

Clocks and Resets
pclk                          1     In   SoPEC functional clock.
prst_n                        1     In   Global reset signal.

Bandstore Signals
cdu_endofbandstore[21:5]      17    In   Address of the end of the current band of data.
                                         256-bit word aligned DRAM address.
cdu_startofbandstore[21:5]    17    In   Address of the start of the current band of data.
                                         256-bit word aligned DRAM address.
te_finishedband               1     Out  TE finished band signal to PCU and ICU.

PCU Interface data and control signals
pcu_addr[8:2]                 7     In   PCU address bus. 7 bits are required to decode the
                                         address space for this block.
pcu_dataout[31:0]             32    In   Shared write data bus from the PCU.
te_pcu_datain[31:0]           32    Out  Read data bus from the TE to the PCU.
pcu_rwn                       1     In   Common read/not-write signal from the PCU.
pcu_te_sel                    1     In   Block select from the PCU. When pcu_te_sel is high
                                         both pcu_addr and pcu_dataout are valid.
te_pcu_rdy                    1     Out  Ready signal to the PCU. When te_pcu_rdy is high it
                                         indicates the last cycle of the access. For a write
                                         cycle this means pcu_dataout has been registered by
                                         the block and for a read cycle this means the data
                                         on te_pcu_datain is valid.

TD (raw Tag Data) DIU Read Interface signals
td_diu_rreq                   1     Out  TD requests DRAM read. A read request must be
                                         accompanied by a valid read address.
td_diu_radr[21:5]             17    Out  TD read address to DIU.
                                         17 bits wide (256-bit aligned word).
diu_td_rack                   1     In   Acknowledge from DIU that TD read request has been
                                         accepted and new read address can be placed on
                                         td_diu_radr.
diu_data[63:0]                64    In   Data from DIU to TE.
                                         First 64-bits are bits 63:0 of 256 bit word;
                                         Second 64-bits are bits 127:64 of 256 bit word;
                                         Third 64-bits are bits 191:128 of 256 bit word;
                                         Fourth 64-bits are bits 255:192 of 256 bit word.
diu_td_rvalid                 1     In   Signal from DIU telling TD that valid read data is
                                         on the diu_data bus.

TFS (Tag Format Structure) DIU Read Interface signals
tfs_diu_rreq                  1     Out  TFS requests DRAM read. A read request must be
                                         accompanied by a valid read address.
tfs_diu_radr[21:5]            17    Out  TFS read address to DIU.
                                         17 bits wide (256-bit aligned word).
diu_tfs_rack                  1     In   Acknowledge from DIU that TFS read request has been
                                         accepted and new read address can be placed on
                                         tfs_diu_radr.
diu_data[63:0]                64    In   First 64-bits are bits 63:0 of 256 bit word;
                                         Second 64-bits are bits 127:64 of 256 bit word;
                                         Third 64-bits are bits 191:128 of 256 bit word;
                                         Fourth 64-bits are bits 255:192 of 256 bit word.
diu_tfs_rvalid                1     In   Signal from DIU telling TFS that valid read data is
                                         on the diu_data bus.

TFU Interface data and control signals
tfu_te_oktowrite              1     In   Ready signal indicating TFU has space available and
                                         is ready to be written to. Also asserted from the
                                         point that the TFU has received its expected number
                                         of bytes for a line until the next te_tfu_wradvline.
te_tfu_wdata[7:0]             8     Out  Write data for TFU.
te_tfu_wdatavalid             1     Out  Write data valid signal. This signal remains high
                                         whenever there is valid output data on te_tfu_wdata.
te_tfu_wradvline              1     Out  Advance line signal strobed when the last byte in a
                                         line is placed on te_tfu_wdata.
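The four-beat ordering listed for diu_data in Table 125 can be illustrated with a short sketch. The function names are hypothetical and the model says nothing about DIU timing; it only shows how four 64-bit transfers map onto one 256-bit DRAM word and how a 17-bit [21:5] read address relates to a byte address.

```python
def assemble_256(beats):
    """Combine four 64-bit beats into one 256-bit word.

    The first beat carries bits 63:0, the second bits 127:64, the third
    bits 191:128 and the fourth bits 255:192, as listed in Table 125.
    """
    if len(beats) != 4:
        raise ValueError("expected four 64-bit beats")
    word = 0
    for i, beat in enumerate(beats):
        if not 0 <= beat < (1 << 64):
            raise ValueError("beat does not fit in 64 bits")
        word |= beat << (64 * i)
    return word


def dram_byte_address(radr):
    """Convert a 17-bit [21:5] read address field to a byte address."""
    if not 0 <= radr < (1 << 17):
        raise ValueError("address field does not fit in 17 bits")
    return radr << 5  # 256-bit words are 32 bytes apart
```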



26.6.5 Configuration Registers 

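The configuration registers in the TE are programmed via the PCU interface. Refer to section 21.8.2 for the description of the protocol and timing diagrams for reading and writing registers in the TE. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the TE. Table 126 lists the configuration registers in the TE.

The DRAM addresses held in these registers are 64-bit DRAM word aligned, as this is the case for the PEC1 TE. SoPEC assumes a 256-bit DRAM word size. If the TE can be easily modified then the DRAM word addressing should be changed to 256-bit word aligned addressing. Otherwise, software should program the 64-bit word aligned addresses on a 256-bit DRAM word boundary.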
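Because registers are accessed as 32-bit words at byte-aligned addresses, only pcu_addr[8:2] takes part in the decode. The sketch below is an illustration under that assumption; the function name is hypothetical, and the example offsets are register offsets taken from Table 126.

```python
def pcu_word_index(byte_offset):
    """Map a TE register byte offset to the 7-bit pcu_addr[8:2] field."""
    if byte_offset % 4 != 0:
        raise ValueError("registers are 32-bit words; offset must be a multiple of 4")
    if not 0 <= byte_offset < 0x200:
        raise ValueError("offset is outside the 7-bit word address space")
    return byte_offset >> 2  # drop the two byte-lane bits
```

For example, the YScale register at offset 0x78 decodes to word index 0x1E, and LinePos at offset 0x104 decodes to word index 0x41.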



Table 126. TE Configuration Registers

Address         Register                      Bits   Reset
                Description

Control registers

0x00            Reset
                A write to this register causes a reset of the TE.
                This register can be read to indicate the reset state:
                0 - reset in progress
                1 - reset not in progress

                Go
                Writing 1 to this register starts the TE. Writing 0 to this register halts the TE.
                When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values.
                When Go is asserted all counters are reset, but configuration registers keep their values (i.e. they don't get reset). NextBandEnable is cleared when Go is asserted.
                The TFU must be started before the TE is started.
                This register can be read to determine if the TE is running (1 = running, 0 = stopped).

Setup registers (constant for processing of a page)

0x40            TfsStartAdr                   19
                (64-bit aligned DRAM address - should start at a 256-bit aligned location)
                Points to the first word of the first TFS line in DRAM.

0x44            TfsEndAdr                     19
                (64-bit aligned DRAM address - should start at a 256-bit aligned location)
                Points to the first word of the last TFS line in DRAM.

0x48            TfsFirstLineAdr               19
                (64-bit aligned DRAM address)
                Points to the first word of the first TFS line to be encountered on the page. If the start of the page is in an inter-tag gap, then this value will be the same as TfsStartAdr since the first tag line reached will be the top line of a tag.

0x4C            DataRedun
                Defines the data to redundancy ratio for the Reed Solomon encoder. Symbol size is always 4 bits. Codeword size is always 15 symbols (60 bits).
                0 - 5 data symbols (20 bits), 10 redundancy symbols (40 bits)
                1 - 7 data symbols (28 bits), 8 redundancy symbols (32 bits)

0x50            Decode2DEn
                Determines whether or not the data bits are to be 2D decoded rather than redundancy encoded (each 2 bits of the data bits becomes 4 output data bits).
                0 = redundancy encode data
                1 = decode each 2 bits of data into 4 bits

0x54            VariableDataPresent
                Defines whether or not there is variable data in the tags. If there is none, no attempt is made to read tag data, and tag encoding should only reference fixed tag data.

0x58            EncodeFixed
                Determines whether or not the lower 40 (or 56) bits of fixed data should be encoded into 120 bits or simply used as is.

0x5C            TagMaxDotpairs
                The width of a tag in dot-pairs, minus 1.
                Minimum 0, Maximum = 191.

0x60            TagMaxLine
                The number of lines in a tag, minus 1.
                Minimum 0, Maximum = 383.

0x64            TagGapDot                     14
                The number of dot pairs between tags in the dot dimension minus 1.
                Only valid if TagGapPresent[bit 0] = 1.

0x68            TagGapLine                    14
                Defines the number of dotlines between tags in the line dimension minus 1.
                Only valid if TagGapPresent[bit 1] = 1.

0x6C            DotPairsPerLine               14
                Number of output dot pairs to generate per tag line.

0x70            DotStartTagSense
                Determines for the first/even (bit 0) and second/odd (bit 1) rows of tags whether or not the first dot position of the line is in a tag.
                1 = in a tag, 0 = in an inter-tag gap.

                TagGapPresent
                Bit 0 is 1 if there is an inter-tag gap in the dot dimension, and 0 if tags are tightly packed.
                Bit 1 is 1 if there is an inter-tag gap in the line dimension, and 0 if tags are tightly packed.

0x78            YScale
                Tag scale factor in Y direction. Output lines to the TFU will be generated YScale times.

0x80 to 0x84    DotStartPos                   2x14
                Determines for the first/even (0) and second/odd (1) rows of tags the number of dotpairs remaining minus 1, in either the tag or inter-tag gap at the start of the line.

0x88 to 0x8C    NumTags                       2x8
                Determines for the first/even and second/odd rows of tags how many tags are present in a line (equals number of tags minus 1).

Setup band related registers

0xC0            NextBandStartTagDataAdr
                (64-bit aligned DRAM address - should start at a 256-bit aligned location)
                Holds the value of StartTagDataAdr for the next band. This value is copied to StartTagDataAdr when DoneBand is 1 and NextBandEnable is 1, or when Go transitions from 0 to 1.

0xC4            NextBandEndOfTagData
                (64-bit aligned DRAM address)
                Holds the value of EndOfTagData for the next band. This value is copied to EndOfTagData when DoneBand is 1 and NextBandEnable is 1, or when Go transitions from 0 to 1.

0xC8            NextBandFirstTagLineHeight
                Holds the value of FirstTagLineHeight for the next band. This value is copied to FirstTagLineHeight when DoneBand is 1 and NextBandEnable is 1, or when Go transitions from 0 to 1.

0xCC            NextBandEnable
                When NextBandEnable is 1 and DoneBand is 1, then when te_finishedband is set at the end of a band:
                - NextBandStartTagDataAdr is copied to StartTagDataAdr
                - NextBandEndOfTagData is copied to EndOfTagData
                - NextBandFirstTagLineHeight is copied to FirstTagLineHeight
                - DoneBand is cleared
                - NextBandEnable is cleared.
                NextBandEnable is cleared when Go is asserted.

Read-only band related registers

                DoneBand                      1      0
                Specifies whether the tag data interface has finished loading all the tag data for the band.
                It is cleared to 0 when Go transitions from 0 to 1.
                When the tag data interface has finished loading all the tag data for the band, the te_finishedband signal is given out and the DoneBand flag is set.
                If NextBandEnable is 1 at this time then StartTagDataAdr, EndOfTagData and FirstTagLineHeight are updated with the values for the next band and DoneBand is cleared. Processing of the next band starts immediately.
                If NextBandEnable is 0 then the remainder of the TE will continue to run, while the read control unit waits for NextBandEnable to be set before it restarts. Read only.

                StartTagDataAdr               19     0
                (64-bit aligned DRAM address - should start at a 256-bit aligned location)
                The start address of the current row of raw tag data. This initially points to the first word of the band's tag data, which should be aligned to a 128-bit boundary (i.e. the lower bit of this address should be 0). Read only.

0xD8            EndOfTagData                  19     0
                (64-bit aligned DRAM address)
                Points to the address of the final tag for the band. When all the tag data up to and including address EndOfTagData has been read in, the te_finishedband signal is given and the DoneBand flag is set. Read only.

0xDC            FirstTagLineHeight            9      0
                The number of lines minus 1 in the first tag encountered in this band. This will be equal to TagMaxLine if the band starts at a tag boundary. Read only.

Work registers (set before starting the TE and must not be touched between bands)

0x100           LineInTag                     1      0
                Determines whether or not the first line of the page is in a line of tags or in an inter-tag gap.
                1 - in a tag, 0 - in an inter-tag gap.

0x104           LinePos                       14     0
                The number of lines remaining minus 1, in either the tag or the inter-tag gap at the start of the page.

0x110 to 0x11C  TagData                       4x32   0
                This 128 bit register must be set up initially with the fixed data record for the page. This is either the lower 40 (or 56) bits (and the EncodeFixed register should be set), or the lower 120 bits (and EncodeFixed should be clear). The TagData[0] register contains the lower 32 bits and the TagData[3] register contains the upper 32 bits.
                This register is used throughout the tag encoding process to hold the next tag's variable data.

Work registers (set internally)
Read-only from the point of view of PCU register access

0x140           DotPos                        14     0
                Defines the number of dotpairs remaining in either the tag or inter-tag gap. Does not need to be set up.

0x144           CurrTagPlaneAdr               14     0
                The dot-pair number being generated.

0x148           DotsInTag                     1      0
                Determines whether the current dot pair is in a tag or not.
                1 - in a tag, 0 - in an inter-tag gap.

0x14C           TagAltSense
                Determines whether the production of output dots is for the first (and subsequent even) or second (and subsequent odd) row of tags.

0x154           CurrTFSAdr                    19
                (64-bit aligned DRAM address)
                Points to the start of the next line of the TFS to be read in.

0x158           ReadsRemaining
                Number of reads remaining in the current burst from the raw tag data interface.

                CountX
                The number of tags remaining to be read (minus 1) by the raw tag data interface for the current line.

0x160           CountY
                The number of times (minus 1) the tag data for the current line of tags needs to be read in by the raw tag data interface.

0x164           RtdTagSense
                Determines whether the raw tag data interface is currently reading even rows of tags (=0) or odd rows of tags (=1) with respect to the start of the page. Note that this can be different from TagAltSense since the raw tag data interface is reading ahead of the production of dots.

0x168           RawTagDataAdr                 19
                (64-bit aligned DRAM address)
                The current read address within the unencoded raw tag data.

The PCU accessible registers are divided amongst the TE top level and the TE sub-blocks. Accesses are decoded in both the sub-blocks and the top level; see Figure 153.

Figure 153. Block diagram of PCU accesses
(Diagram not reproduced. It shows the PCU control signals and pcu_dataout[31:0] feeding the sub-block and top level registers, with a read decode driving te_pcu_datain[31:0] and te_pcu_rdy.)

26.6.5.1 Starting the TE and restarting the TE between bands



The TE must be started after the TFU. 

For the first band of data, users set up NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight as well as other TE configuration registers. Users then set the TE's Go bit to start processing of the band. When the tag data for the band has finished being decoded, the te_finishedband interrupt will be sent to the PCU and ICU, indicating that the memory associated with the first band is now free. Processing can now start on the next band of tag data.

In order to process the next band, NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight need to be updated before writing a 1 to NextBandEnable. There are 4 mechanisms for restarting the TE between bands:

a. te_finishedband causes an interrupt to the CPU. The TE will have set its DoneBand bit. The CPU reprograms the NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight registers, and sets NextBandEnable to restart the TE.

b. The CPU programs the TE's NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight registers and sets the NextBandEnable flag before the end of the current band. At the end of the current band the TE sets DoneBand. As NextBandEnable is already 1, the TE starts processing the next band immediately.

c. The PCU is programmed so that te_finishedband triggers the PCU to execute commands from DRAM to reprogram the NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight registers and set the NextBandEnable bit to start the TE processing the next band. The advantage of this scheme is that the CPU could process band headers in advance and store the band commands in DRAM ready for execution.

d. This is a combination of b and c above. The PCU (rather than the CPU in b) programs the TE's NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight registers and sets the NextBandEnable bit before the end of the current band. At the end of the current band the TE sets DoneBand and pulses te_finishedband. As NextBandEnable is already 1, the TE starts processing the next band immediately. Simultaneously, te_finishedband triggers the PCU to fetch commands from DRAM. The TE will have restarted by the time the PCU has fetched commands from DRAM. The PCU commands program the TE next band shadow registers and set the NextBandEnable bit.
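The interaction between the next-band shadow registers, DoneBand and NextBandEnable in the mechanisms above can be modelled in a few lines. This is a behavioural sketch only: the class and method names are hypothetical, and the model covers just the register copies and flag updates described in this section, not the tag data fetch itself.

```python
from dataclasses import dataclass


@dataclass
class BandRegs:
    # Shadow ("NextBand") registers, written by the CPU or PCU.
    next_start: int = 0
    next_end: int = 0
    next_first_height: int = 0
    next_band_enable: bool = False
    # Active, read-only registers.
    start: int = 0
    end: int = 0
    first_height: int = 0
    done_band: bool = False

    def _load_shadow(self):
        self.start = self.next_start
        self.end = self.next_end
        self.first_height = self.next_first_height

    def _try_restart(self):
        """Start the next band if DoneBand and NextBandEnable are both set."""
        if self.done_band and self.next_band_enable:
            self._load_shadow()
            self.done_band = False
            self.next_band_enable = False
            return True
        return False

    def assert_go(self):
        """Go transitions from 0 to 1: load the shadows and clear the flags."""
        self._load_shadow()
        self.done_band = False
        self.next_band_enable = False

    def end_of_band(self):
        """te_finishedband: set DoneBand, restart at once if already enabled."""
        self.done_band = True
        return self._try_restart()

    def write_next_band_enable(self):
        """A write of 1 to NextBandEnable by the CPU or PCU."""
        self.next_band_enable = True
        return self._try_restart()
```

In this model, mechanism a corresponds to calling end_of_band() before write_next_band_enable(), while mechanisms b and d correspond to the opposite order, in which case the restart happens inside end_of_band().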

After the first tag on the page, all bands have their first tag start at the top, i.e. NextBandFirstTagLineHeight = TagMaxLine. Therefore the same value of NextBandFirstTagLineHeight will normally be used for all bands. Certainly, NextBandFirstTagLineHeight should not need to change after the second time it is programmed.

26.6.6 TE Top Level FSM

The following diagram illustrates the states in the FSM.

Figure 154. Tag Encoder Top-Level FSM
(Diagram not reproduced. Reset places the FSM in the idle state; Go = 1 moves it to the tag_dot_line state, where it remains while producing valid tag lines.)

At the highest level, the TE state machine steps through the output lines of a page one line at a time, with the starting position either in an inter-tag gap (signal dotsintag = 0) or in a tag (signals tfsvalid and tdvalid and lineintag = 1). A SoPEC may be printing only part of a tag, because multiple SoPECs can print a single line.

If the current position is within an inter-tag gap, an output of 0 is generated. If the current position is within a tag, the tag format structure is used to determine the value of the output dot, using the appropriate encoded data bit from the fixed or variable data buffers as necessary. The TE then advances along the line of dots, moving through tags and inter-tag gaps according to the tag placement parameters.
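The walk along a line of dots can be sketched as a small software model. It is an illustration only: the function name is hypothetical, a gap in the dot dimension is assumed to be present, and the parameters are modelled on the TagMaxDotpairs, TagGapDot, DotStartTagSense and DotStartPos registers, all of which hold "minus 1" values.

```python
def dotsintag_for_line(dot_pairs_per_line, tag_max_dotpairs, tag_gap_dot,
                       start_in_tag, dot_start_pos):
    """Return one in-tag flag (1 = tag, 0 = gap) per dot pair of a line.

    dot_start_pos is the number of dot pairs remaining, minus 1, in the tag
    or gap in which the line starts.
    """
    flags = []
    dotsintag = 1 if start_in_tag else 0
    dotpos = dot_start_pos
    for _ in range(dot_pairs_per_line):
        flags.append(dotsintag)
        if dotpos == 0:
            # Last dot pair of this tag or gap: switch sense and reload.
            dotsintag ^= 1
            dotpos = tag_max_dotpairs if dotsintag else tag_gap_dot
        else:
            dotpos -= 1
    return flags
```

For a tag 3 dot pairs wide and a gap 2 dot pairs wide, the flags repeat as three 1s followed by two 0s.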
Table 127 highlights the signals used within the FSM. 



Table 127. Signals used within TE top level FSM

Signal name            Function

pclk                   Sync clock used to register all data within the FSM
prst_n, te_reset       Reset signals
advtagline             1 cycle pulse indicating to TDI and TFS sub-blocks to move onto
                       the next line of tag data
currdotlineadr[13:0]   Address counter starting 2 pclk cycles ahead of currtagplaneadr
                       to generate the correct dotpair for the current line
dotpos                 Counter to identify how many dotpairs wide the tag/gap is
dotsintag              Signal identifying whether the dotpairs are in a tag (1) or gap (0)
lineintag_temp         Identical to lineintag but generated 1 pclk earlier
linepos_shadow         Shadow register for linepos due to linepos being written to by
                       2 different processes
tagaltsense            Flag which alternates between tag/gap lines
te_state               FSM state variable
teplanebuf             6-bit shift register used to format dotpairs into a byte for the TFU
wradvline              Advance line signal strobed when the last byte in a line is placed
                       on te_tfu_wdata

Due to the 2 system clock delay in the TFS (both Table A and Table B outputs are registered), the TE FSM is working 2 system clock cycles ahead of the logic generating the write data for the TFU. As a result the following control signals had to be single/double registered on the system clock:



dotsintag
tdvalid
tfsvalid
tfu_ok_write
lineintag_temp

Figure 155. Generated Control Signals
(Diagram not reproduced. Each signal is registered on pclk to give dotsintag1, tdvalid1, tfsvalid1, tfu_ok_write1 and lineintag1; a second register stage gives dotsintag2, tdvalid2, tfsvalid2 and tfu_ok_write2.)
The tag_dot_line state can be broken down into 3 different stages.

Stage 1: The tag_dot_line state is entered due to the go signal becoming active. This state controls the writing of dot bytes to the TFU. As long as the tag line buffer address is not equal to the dotpairsperline register value, tfu_te_oktowrite is active, and there is valid TFS and TD available or the position is in a tag gap, dotpairs are buffered into bytes and written to the TFU. The tag line buffer address has little significance in SoPEC, as no addresses are supplied to the TFU since the TFU is a FIFO rather than the line store used in PEC1.

While generating the dotline of a tag/gap line (lineintag flag = 1), the dot position counter dotpos is decremented/reloaded (with tagmaxdotpairs or taggapdot) as the TE moves between tags/gaps. The dotsintag flag is toggled between tags/gaps (0 for a gap, 1 for a tag). This pattern continues until the end of a dotline. 2 system clock cycles before the end of the dotline, the lineintag and tagaltsense signals must be prepared for the next dotline, be it a tag/gap dotline or a purely gap dotline.

Stage 2: At this point the end of a dotline is reached, so it is time to decrement the linepos counter if still in the tag/gap row, or to reload the linepos register and dotpos counter and reprogram the dotsintag flag if going onto another tag/gap or pure gap row. Any signal with the _temp extension is a register that is updated a cycle early so that the real register gets its correct value when switching from one dotline to the next, e.g. when the end of a tag row is reached. This stage uses the signals lineintag_temp and tagaltsense, which were generated one system clock cycle earlier in Stage 1.

Stage 3: This stage implements the counter for currtagplaneadr. currtagplaneadr is reset on reaching currtagplaneadr = (dotpairsperline - 1). All the qualifier signals, e.g. dotsintag, are delayed by 2 system clock cycles, as currtagplaneadr (which is the internal write address, not needed by the TFU) cannot be incremented until the dotpairs are available, which is always 2 system clock cycles later than when currdotlineadr is incremented.

The wradvline and advtagline pulses are generated using the same logic (currently separated in the PEC1 Tag Encoder VHDL for clarity). Both of these pulses are used to update further registers, which is why they do not use the qualifiers delayed by 2 system clock cycles.



26.6.7 Combinational Logic 



The TDI is responsible for providing the information data for a tag, while the TFSI is responsible for deciding whether a particular dot on the tag should be printed as background pattern or tag information. Every dot within a tag's boundary is either an information dot or part of the background pattern.

Figure 156. Logic to combine dot information and Encoded Data
(Diagram not reproduced. The outputs dots[0] and dots[1] are qualified by dotsintag.)

The resulting lines of dots are stored in the TFU. 

The TFSI reads one Tag Line Structure (TLS) from the DIU for every dot line of tags. Depending on the current printing position within the tag (indicated by the signal tagdotnum), the TFS interface outputs dot information for two dots and, if necessary, the corresponding read addresses for encoded tag data. The read addresses are supplied to the TDI, which outputs the corresponding data values.

These data values (tdi_etd0 and tdi_etd1) are then combined with the dot information (tfsi_ta_dot0 and tfsi_ta_dot1) to produce the dot values that will actually be printed on the page (dots); see Figure 156.

Figure 157. Generation of Lastdotintag/1
(Diagram not reproduced. Inputs shown: dotsintag, tfsvalid, tdvalid, dotpos and dotpairsperline; output: lastdotintag1.)

The signal lastdotintag is generated by checking that the dots are in a tag (dotsintag = 1) and that the dot position counter dotpos is equal to zero. It is also used by the TFS to load the index address register with zeros at the end of a tag, as this is always the starting index when going from one tag to the next. lastdotintag is gated with advtagline in the TFSI (Table C), where the adv_tfs_line pulse is used to update the Table C address register for the new tag line. This is because lastdotintag occurs a cycle earlier, which would otherwise result in the wrong Table C value for the last dotpair. lastdotintag is also used in the TDI FSM (etd_switch state) to pulse the etd_advtag signal, hence switching buffers for the next tag.

The signal lastdotintag1 is identical to lastdotintag except that it is combinatorially generated (1 cycle earlier than lastdotintag, except at the end of a tagline). lastdotintag1 is only used in the TDI, to reset the tdvalid signal on the cycle when dotpos = 0. Note that the comparison against dotpairsperline in the lastdotintag1 generation process uses an offset of 2, as this is a combinatorial process.



Figure 158. Generation of Dot Position Valid
(Diagram not reproduced. Inputs shown: dotsintag1, tfsvalid1, lineintag1 and the registered TFU ok-to-write signal; output: dotposvalid.)

The dotposvalid signal is created based on being in a tag line (lineintag1 = 1), dots being in a tag (dotsintag1 = 1), having a valid tag format structure available (tfsvalid1 = 1) and having encoded tag data available (tdvalid1 = 1). Note that each of the qualifier signals is delayed by 1 pclk cycle.



Figure 159. Generation of write enable to the TFU
(Diagram not reproduced. Inputs shown: dotsintag, tfsvalid2, tdvalid2 and currtagplaneadr; outputs: te_tfu_we, and te_tfu_wradr taken from currtagplaneadr[13:2].)

The signal te_tfu_wdatavalid can only be active if in a tag gap or if valid tag data is available (tdvalid2 and tfsvalid2), and currtagplaneadr(1:0) equals 11, i.e. a byte of data has been generated by combining four dotpairs.
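The byte-forming condition can be illustrated with a short sketch in which a byte is emitted whenever the two low bits of the dot-pair address equal 11. The function name is hypothetical, and the placement of the first dot pair in the least significant bits is an assumption, since the bit order within the byte is not stated here.

```python
def pack_dotpairs(dotpairs):
    """Pack 2-bit dot pairs into bytes, four dot pairs per byte.

    A byte is emitted when the two low bits of the running dot-pair address
    are 0b11. Assumed bit order: the first dot pair occupies bits 1:0.
    A trailing group of fewer than four dot pairs is not emitted.
    """
    out = []
    byte = 0
    for adr, pair in enumerate(dotpairs):
        if not 0 <= pair <= 3:
            raise ValueError("a dot pair is a 2-bit value")
        byte |= pair << (2 * (adr & 3))
        if adr & 3 == 3:
            out.append(byte)
            byte = 0
    return out
```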



Figure 160. Generation of Tag Dot Number
(Diagram not reproduced. tagdotnum is formed from tagmaxdotpairs and dotpos.)

The signal tagdotnum tells the TFS how many dotpairs remain in a tag/gap. It is calculated by subtracting the value in the dotpos counter from the value programmed in the tagmaxdotpairs register.

26.7 Tag Data Interface (TDI)

26.7.1 I/O Specification 



Table 128. TDI Port List

Port name                     Pins  I/O  Description

Clocks and Resets
pclk                          1     In   SoPEC system clock.
prst_n                        1     In   Active-low, synchronous reset in pclk domain.

DIU Read Interface Signals
diu_data[63:0]                      In   Data from DRAM.
td_diu_rreq                         Out  Data request to DRAM.
td_diu_radr[21:5]                   Out  Read address to DRAM.
diu_td_rack                         In   Data acknowledge from DRAM.
diu_td_rvalid                       In   Data valid signal from DRAM.

PCU Interface Data and Control Signals
pcu_dataout[31:0]                   In   PCU writes this data.
pcu_addr[8:2]                       In   PCU accesses this address.
pcu_rwn                             In   Global read/write-not signal from PCU.
pcu_te_sel                          In   PCU selects TE for r/w access.
pcu_te_reset                        In   PCU reset.
td_te_doneband                      Out  PCU readable registers.
td_te_dataredun
td_te_decode2den
td_te_variabledatapresent
td_te_encodefixed
td_te_numtags0
td_te_numtags1
td_te_starttagdataadr
td_te_rawtagdataadr
td_te_endoftagdata
td_te_firsttaglineheight
td_te_tagdata0
td_te_tagdata1
td_te_tagdata2
td_te_tagdata3
td_te_countx
td_te_county
td_te_rtdtagsense
td_te_readsremaining

TFS (Tag Format Structure)
tfsi_adr0[8:0]                      In   Read address for dot0.
tfsi_adr1[8:0]                      In   Read address for dot1.

Bandstore Signals
cdu_startofbandstore[24:0]          In   Start of the memory area allocated for page bands.
cdu_endofbandstore[24:0]            In   Last address of the memory allocated for page bands.
te_finishedband                     Out  Tag encoder band finished.



Figure 161. TDI Architecture
(Diagram not reproduced. Labels shown: DRAM interface, tdValid, lastDotInTag, lastDotInTag1 and tagIsPrinted.)



26.7.2 Introduction 



The tag data interface is responsible for obtaining the raw tag data and encoding it as required by the tag encoder. The smallest typical tag placement is 2mm x 2mm, which means a tag is at least 126 1600 dpi dots wide.

In PEC1, in order to keep up with the HCU, which processes 2 dots per cycle, the tag data interface has been designed to be capable of encoding a tag in 63 cycles. This is actually accomplished in approximately 52 cycles within PEC1. For SoPEC the TE need only produce one dot per cycle; it should be able to produce tags in no more than twice the time taken by the PEC1 TE. Moreover, any change in implementation from two dots to one dot per cycle should not lose the 63/52 cycle performance edge attained in the PEC1 TE.

The tag data interface contains a raw tag data interface FSM that fetches tag data from DRAM, two symbol-at-a-time GF(2^4) Reed-Solomon encoders, an encoded data interface and a state machine for controlling the encoding process. It also contains a tagData register that needs to be set up to hold the fixed tag data for the page.

The type of encoding used depends on the registers TE_encodefixed, TE_dataredun and TE_decode2den, the options being:

• (15,5) RS coding, where every 5 input symbols are used to produce 15 output symbols, so the output is 3 times the size of the input. This can be performed on fixed and variable tag data.

• (15,7) RS coding, where every 7 input symbols are used to produce 15 output symbols, so for the same number of input symbols the output is not as large as with the (15,5) code (for more details see section 26.7.6 on page 400). This can be performed on fixed and variable tag data.

• 2D decoding, where each 2 input bits are used to produce 4 output bits. This can be performed on fixed and variable tag data.

• No coding, where the data is simply passed into the Encoded Data Interface. This can be performed on fixed data only.

Each tag is made up of fixed tag data (i.e. this data is the same for each tag on the page) and variable tag data (i.e. different for each tag on the page).

Fixed tag data is supplied as 120 bits when it is already coded (or no coding is required), 40 bits when (15,5) coding is required, or 56 bits when (15,7) coding is required. Once the fixed tag data is coded it is 120 bits long. It is then stored in the Encoded Tag Data Interface.

The variable tag data is stored in the DRAM in uncoded form. When (15,5) coding is required, the 120 bits stored in DRAM are encoded into 360 bits. When (15,7) coding is required, the 112 bits stored in DRAM are encoded into 240 bits. When 2D decoding is required, the 120 bits stored in DRAM are converted into 240 bits. In each case the encoded bits are stored in the Encoded Tag Data Interface.

The encoded fixed and variable tag data are eventually used to print the tag.
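The encoded sizes quoted for each option follow directly from the code parameters (4-bit symbols, 15-symbol codewords). The helper below is a sketch for checking that arithmetic only; the function name and the mode strings are hypothetical, and no actual Reed-Solomon encoding is performed.

```python
def encoded_bits(raw_bits, mode):
    """Return the encoded size in bits for a given raw size and coding mode.

    mode is one of "rs15_5", "rs15_7", "2d" or "none" (illustrative names).
    """
    if mode == "none":
        return raw_bits
    if mode == "2d":
        if raw_bits % 2:
            raise ValueError("2D decoding consumes 2 bits at a time")
        return raw_bits * 2            # every 2 input bits become 4 output bits
    data_symbols = {"rs15_5": 5, "rs15_7": 7}[mode]
    symbols, rem = divmod(raw_bits, 4)  # symbol size is always 4 bits
    if rem or symbols % data_symbols:
        raise ValueError("raw size is not a whole number of codewords")
    codewords = symbols // data_symbols
    return codewords * 15 * 4           # 15 symbols of 4 bits per codeword
```

For example, 40 bits of fixed data under (15,5) coding and 56 bits under (15,7) coding both give 120 encoded bits, while 120 bits of variable data under (15,5) coding give 360 bits.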

The fixed tag data is loaded once at the start of a page. It is encoded as necessary and is then stored in
2 of the 8 x 15-bit registers/RAMs in the Encoded Tag Data Interface. This data remains
unchanged in the registers/RAMs until the next page is ready to be processed.

The variable tag data for each tag is stored in four 32-bit words. The TE re-reads the variable tag data
for a particular tag from DRAM every time it produces that tag. The variable tag data FIFO which reads
from DRAM has enough space to store 4 tags.

26.7.3 Data Flow

An overview of the dataflow through the TDI can be seen in Figure 162 below. 



[Figure 162: block diagram of the TDI data flow. Raw tag data is fetched from DRAM into the Raw Tag
Data Interface (an 8 x 64-bit FIFO), read into the 128-bit Tag Data Register, passed through the Reed
Solomon encoders / 2D decoders, and stored in the Encoded Tag Data Interface. Annotations recoverable
from the figure:
- The minimum tag width is 126 dots, so at 2 dots per cycle the fastest rate at which a tag can be read is
  126/2 = 63 cycles. One tag's data must therefore be read from the Raw Tag Data Interface, encoded and
  stored in the Encoded Tag Data Interface in 63 cycles or less.
- Maximum dots per line = 1600 x 12.8 = 20480, so the maximum number of tags per line = 20480/126 = 163.
- Once a line of raw tag data is loaded it is valid for at least 126 lines.
- Encoded fixed data can be up to 120 bits long; 2 buffers allow 2 simultaneous reads in one cycle
  (120 x 2 = 240 bits).
- Encoded variable data can be up to 360 bits long; 2 buffers allow 2 simultaneous reads in one cycle, and
  2 buffers allow simultaneous read/write.]

Figure 162. Data Flow Through the TDI

The TD interface consists of the following main sections: 

• the Raw Tag Data Interface - fetches tag data from DRAM;

• the tag data register;

• 2 Reed Solomon encoders - each encodes one 4-bit symbol at a time;

• the Encoded Tag Data Interface - supplies encoded tag data for output;

• two 2D decoders.

The main performance specification for PEC1 is that the TE must be able to output data at a continuous
rate of 2 dots per cycle.
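The 2 dots per cycle requirement sets the time budget for preparing each tag. The following Python sketch works through that arithmetic. The 126-dot minimum tag width, the 1600 dpi resolution and the 12.8 inch line width are the values given in the Figure 162 annotations; the function names are illustrative only.

```python
import math

DOTS_PER_CYCLE = 2        # PEC1 output rate stated above
MIN_TAG_WIDTH_DOTS = 126  # minimum tag width


def cycles_per_tag(tag_width_dots=MIN_TAG_WIDTH_DOTS):
    """Cycles available to fetch, encode and store one tag's data."""
    return tag_width_dots // DOTS_PER_CYCLE


def max_tags_per_line(dpi=1600, line_inches=12.8, tag_width_dots=MIN_TAG_WIDTH_DOTS):
    """Upper bound on tags in one row, counting a partial tag at the line end."""
    dots_per_line = round(dpi * line_inches)
    return math.ceil(dots_per_line / tag_width_dots)


print(cycles_per_tag())
print(max_tags_per_line())
```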
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26.7.4 Raw tag data interface 

The raw tag data interface (RTDI) provides a single means of accessing raw tag data in DRAM. The RTDI
passes tag data into a FIFO where it can be subsequently read as required. The 64-bit output from the
FIFO can be read directly, with the value of the wr_rd_counter being used to set/reset the enable signal
(rtdAvail). The FIFO is clocked out with receipt of an rtdRd signal from the TS FSM.

Figure 163 shows a block diagram of the raw tag data interface.



[Figure 163: block diagram showing the DRAM interface, the raw tag data interface FSM and the raw tag
data FIFO. Signals shown: diu_data[63:0], wrptr, rdptr, rtd_fifo_wr_en, pclk, rtdbuf[63:0] and
te_finishedband. The FIFO output is registered as 2 x 64 bits in the Tag Data Register.]

Figure 163. Raw tag data interface block diagram



26.7.4.1 RTDI FSM 



The RTDI state machine is responsible for keeping the raw tag FIFO full. The state machine reads the line
of tag data once for each printline that uses the tag. This means a given line of tag data will be read at least
126 times, since the tag height is 126 lines for 2 mm tags. Note that the first line of tag data may be read
fewer than 126 times since the start of the page may be within a tag. In addition, odd and even rows of tags
may contain different numbers of tags.

Section 26.6.5.1 outlines how to start the TE and restart it between bands. Users must set the
NextBandStartTagDataAdr, NextBandEndOfTagData, NextBandFirstTagLineHeight and numTags[0], numTags[1]
registers before starting the TE by asserting Go.

To restart the tag encoder for second and subsequent bands of a page, the NextBandStartTagDataAdr,
NextBandEndOfTagData and NextBandFirstTagLineHeight registers need to be updated (typically
numTags[0] and numTags[1] will be the same if the previous band contains an even number of tag rows).
A description of the four ways of reprogramming the TE between bands is given elsewhere in this
specification.

The tag data is read once for every printline containing tags. When maximally packed, a row of tags
contains 163 tags (see Table 121 on page 364).

The RTDI State Flow diagram is shown in Figure 164. An explanation of the states follows:

idle: Stay in the idle state if there is no variable data present. If there is variable data present and
there are at least 4 spaces left in the FIFO then request a burst of 2 tags from the DRAM (1 * 256 bits).
Counter countx is assigned the number of tags in an even/odd line, which depends on the value of register
rtdtagsense. Down-counter county is assigned the number of dot lines high a tag will be (min 126).
Initially it must be set to the firsttaglineheight value as the TE may be between pages (i.e. a partial tag).
For normal tag generation county will take the value of the tagmaxline register.

diu_access: The diu_access state will generate a request to the DRAM if there are at least 4 spaces in the
FIFO. This is indicated by the counter wr_rd_counter, which is incremented/decremented on writes/reads
of the FIFO. As long as wr_rd_counter is less than 4 (the FIFO is 8 high) there must be 4 locations free. A
control signal called td_diu_radrvalid is generated for the duration of the DRAM burst access, and
burst_count controls this signal (this will involve modification to existing TE code).

If there is an odd number of tags in a line then the last DRAM read will contain a tag in the first 128 bits and
padding in the final 128 bits.
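The burst size and padding rule above can be checked with a short calculation. This is a sketch in Python, assuming 128 bits of raw data per tag (four 32-bit words) and 256-bit DRAM reads as stated in the text; the function name is illustrative.

```python
TAG_BITS = 128        # raw variable data per tag (four 32-bit words)
DRAM_READ_BITS = 256  # one DRAM read returns 2 tags


def dram_reads_for_row(num_tags):
    """Return (number of 256-bit reads, padding bits) for one row of tags."""
    tags_per_read = DRAM_READ_BITS // TAG_BITS
    reads = -(-num_tags // tags_per_read)  # ceiling division
    padding_bits = reads * DRAM_READ_BITS - num_tags * TAG_BITS
    return reads, padding_bits


print(dram_reads_for_row(163))  # odd number of tags: last read is half padding
print(dram_reads_for_row(162))  # even number of tags: no padding
```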

fifo_load: This state controls the addressing to the DRAM. Counters countx and county are used to monitor
whether the TE is processing a line of dots within a row of tags. When countx is zero it means all tag
data for this row is complete. When county is zero it means the TE is on the last line of dots (prior to Y
scaling) for this row of tags. When a row of tags is complete the sense of rtdtagsense is inverted (odd/
even). The rawtagdataadr is compared to the te_endoftagdata address. If rawtagdataadr = endoftagdata
the doneband signal is set, the finishedband signal is pulsed, and the FSM enters the rtd_stall state until
the doneband signal is reset to zero by the PCU, by which time the rawtagdataadr, endoftagdata and
firsttaglineheight registers are set up with new values to restart the TE. This state is also used to count the
64-bit reads from the DIU. Each time diu_td_rvalid is high rtd_data_count is incremented by 1. The compare of
rtd_data_count = rtd_num is necessary to find out when either all 4*64-bit data has been received or
n*64-bit data (depending on a match of rawtagdataadr = endoftagdata in the middle of a set of 4*64-bit
values being returned by the DIU).

rtd_stall: This state waits for the doneband signal to be reset (see page 379 for a description of how
this occurs). Once reset the FSM returns to the idle state. This state also performs the same count on the
diu_data read as above in the case where diu_td_rvalid has not gone high by the time the addressing is
complete and the end of band data has been reached, i.e. rawtagdataadr = endoftagdata.



[Figure 164: state flow diagram with states IDLE, DIU_ACCESS, FIFO_LOAD and RTD_STALL. Recoverable
transition labels: variabledatapresent = 0; go = 1 AND a wr_rd_counter condition; end of burst;
doneband = 1; doneband = 0.]

Figure 164. RTDI State Flow Diagram



[Figure 165: DRAM address map, with addresses increasing, showing cdu_startofbandstore,
TE_endoftagdata (for band N), TE_endoftagdata (for band N+1) and cdu_endofbandstore.]

Figure 165. Relationship between TE_endoftagdata, cdu_startofbandstore and cdu_endofbandstore
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26.7.5 TDI state machine 



The tag data state machine has two processing phases. The first processing phase is to encode the fixed tag 
data stored in the 128-bit (2 x 64-bit) tag data register. The second is to encode tag data as it is required by 
the tag encoder. 

When the Tag Encoder is started up, the fixed tag data is already preloaded in the 128-bit tag data record. If
encodeFixed is set, then the 2 codewords stored in the lower bits of the tag data record need to be encoded:
40 bits if dataRedun = 0, and 56 bits if dataRedun = 1. If encodeFixed is clear, then the lower 120 bits of
the tag data record must be passed to the encoded tag data interface without being encoded.

When encodeFixed is set, the symbols derived from codeword 0 are written to codeword 6 and the symbols
derived from codeword 1 are written to codeword 7. The data symbols are stored first and then the
remaining redundancy symbols are stored afterwards, for a total of 15 symbols. Thus, when dataRedun =
0, the 5 symbols derived from bits 0-19 are written to symbols 0-4, and the redundancy symbols are
written to symbols 5-14. When dataRedun = 1, the 7 symbols derived from bits 0-27 are written to symbols
0-6, and the redundancy symbols are written to symbols 7-14.

When encodeFixed is clear, the 120 bits of fixed data is copied directly to codewords 6 and 7.
The TDI State Flow diagram is shown in Figure 166. An explanation of the states follows.




[Figure 166: state flow diagram with states idle, fixed_data, bypass_to_etdi, rs_15_5, rs_15_7,
decode_2d_15_5, decode_2d_15_7, etd_buf_switch, read_tag_data and load_tag_data. Transitions are
conditioned on top_go, donefixed, encodefixed, dataredun, decode2den and variabledatapresent.]

Figure 166. TDI State Flow Diagram

idle: In the idle state wait for the tag encoder go signal, top_go = 1. The first task is to either store or
encode the fixed data. Once the fixed data is stored or encoded/stored the donefixed flag is set. If there is
no variable data the FSM returns to the idle state, hence the reason to check the donefixed flag before
advancing, i.e. only store/encode the fixed data once.

fixed_data: In the fixed_data state the FSM must decode whether to directly store the fixed data in the
ETDi, or whether the fixed data needs to be either (15,5) (40 bits) or (15,7) (56 bits) RS encoded or 2D
decoded. The values stored in registers encodefixed, dataredun and decode2den determine what the next
state should be.

bypass_to_etdi: The bypass_to_etdi state takes 120 bits of fixed data (pre-encoded) from the tag_data[127:0]
register and stores it in the 15*8 (by 2 for simultaneous reads) buffers. The data is passed from the
tag data register through 3 levels of muxing (level1, level2, level3) where it enters the RS0/RS1 encoders,
which are now in a straight-through mode (i.e. control_5 and control_7 are zero, hence the data passes
straight from the input to the output). The MSBs of the etd_wr_adr must be high to store this data as
codewords 6 and 7.

etd_buf_switch: This state is used to set the tdvalid signal and pulse the etd_adv_tag signal, which in turn
switches the read/write sense of the ETDi buffers (wrsb0). The firsttime signal is used to identify
the first time a tag is encoded. If zero it means read the tag data from the RTDi FIFO and encode. Once
encoded and stored the FSM returns to this state where it evaluates the sense of tdvalid. First time around
it will be zero, so the FSM sets tdvalid and returns to the readtagdata state to fill the 2nd ETDi buffer.
After this the FSM returns to this state and waits for the lastdotintag signal to arrive. In between tags,
when the lastdotintag signal is received the etd_adv_tag is pulsed and the FSM goes to the readtagdata
state. However, if the lastdotintag signal arrives at the end of a line there is an extra 1 cycle delay
introduced in generating the etd_adv_tag pulse (via etd_adv_tag_endofline) due to the pipelining in the
TFS. This allows all the previous tag to be read from the correct buffer and a seamless transfer to the other
buffer for the next line.

readtagdata: The readtagdata state waits to receive an rtdavail signal from the raw tag data interface which
indicates there is raw tag data available. The tag_data register is 128 bits so it takes 2 pulses of the rtdrd
signal to get the 2*64 bits into the tag_data register. If the rtdavail signal is set, rtdrd is pulsed for 1 cycle
and the FSM steps on to the loadtagdata state. Initially the flag first64bits will be zero. The 64 bits of rtd
are assigned to tag_data[63:0] and the flag first64bits is set to indicate the first raw tag data read is
complete. The FSM then steps back to the readtagdata state where it generates the second rtdrd pulse.
The FSM then steps to the loadtagdata state, where the second 64 bits of raw tag data are assigned to
tag_data[127:64].

loadtagdata: The loadtagdata state writes the raw tag data into the tag_data register from the RTDi FIFO.
The first64bits flag is reset to zero as the tag_data register now contains 120/112 bits of variable data. A
decode of whether to (15,5) or (15,7) RS encode or 2D decode this data decides the next state.

rs_15_5: The rs_15_5 (Reed Solomon (15,5) mode) state either encodes 40-bit fixed data or 120-bit
variable data and provides the encoded tag data write address and write enable (etd_wr_adr and etd_we
respectively). Once the fixed tag data is encoded the donefixed flag is set, as this only needs to be done once
per page. The variabledatapresent register is then polled to see if there is variable data in the tags. If there
is variable data present then this data must be read from the RTDi and loaded into the tag_data register.
Else the tdvalid flag must be set and the FSM returns to the idle state. control_5 is a control bit for the RS
encoder and controls feedforward and feedback muxes that enable (15,5) encoding.
The rs_15_5 state also generates the control signals for passing 120 bits of variable tag data to the RS
encoder in 4-bit symbols per clock cycle. rs_counter is used both to control the level1_mux and to act as the
15-cycle counter of the RS encoder. This logic cycles for a total of 3*15 cycles to encode the 120 bits.

rs_15_7: The rs_15_7 state is similar to the rs_15_5 state except the level1_mux has to select 7 4-bit
symbols instead of 5.

decode_2d_15_5, decode_2d_15_7: The decode_2d states provide the control signals for passing the
120-bit variable data to the 2D decoder. The 2 LSBs are decoded to create 4 bits. The 4 bits from each
decoder are combined and stored in the ETDi. Next the 2 MSBs are decoded to create 4 bits. Again the 4
bits from each decoder are combined and stored in the ETDi.

As can be seen from Figure 161 on page 386 there are 3 stages of muxing between the Tag Data register
and the RS encoders or 2D decoders. Levels 1-2 are controlled by level1_mux and level2_mux, which are
generated within the TDi FSM, as is the write address to the ETDi buffers (etd_wr_adr).

Figures 167 through 172 illustrate the mappings used to store the encoded fixed and variable tag data in the
ETDI buffers.

[Figures 167 to 172: codeword mapping diagrams. The only recoverable annotation is that data symbols
d0 to d9 are encoded and stored during cycles N to N+14.]

Figure 167. Mapping of the tag data to codewords 0-7

Figure 168. Coding and mapping of uncoded Fixed Tag Data for (15,5) RS encoder

Figure 169. Mapping of pre-coded Fixed Tag Data

Figure 170. Coding and mapping of Variable Tag Data for (15,7) RS encoder

Figure 171. Coding and mapping of uncoded Fixed Tag Data for (15,7) RS encoder

Figure 172. Mapping of 2D decoded Variable Tag

26.7.6 Reed Solomon (RS) Encoder



26.7.7 Introduction 



A Reed Solomon code is a non-binary block code. If a symbol consists of m bits then there are q = 2^m
possible symbols defining the code alphabet. In the TE, m = 4 so the number of possible symbols is q = 16.

An (n,k) RS code is a block code with k information symbols and n code-word symbols. RS codes have
the property that the code word n is limited to at most q+1 symbols in length.

The RS coder used is selected as follows:

• TE_dataredun = 0 and TE_decode2den = 0, then use the (15,5) RS coder

• TE_dataredun = 1 and TE_decode2den = 0, then use the (15,7) RS coder

For a (15,k) RS code with m = 4, k 4-bit information symbols applied to the coder produce 15 4-bit
codeword symbols at the output.

A simple block diagram can be seen in Figure 173.

[Figure 173: k input symbols (1, 2 ... k) enter an RS (n,k) encoder with symbol size m = 4, which
produces n output symbols (1, 2 ... n-1, n).]

Figure 173. Simple block diagram for an m=4 Reed Solomon Encoder

26.7.8 I/O Specification 

An I/O diagram of the RS encoder can be seen in Figure 174.

[Figure 174: Reed Solomon Encoder block with inputs pclk, rs_data_in[3:0], enable and TE_dataredun,
and output rs_data_out[3:0].]

Figure 174. RS Encoder I/O diagram

26.7.9 Proposed implementation

In the case of the TE, (15,5) and (15,7) codes are to be used with 4 bits per symbol.

The primitive polynomial is $p(x) = x^4 + x + 1$.

In the case of the (15,5) code, this gives a generator polynomial of

$g(x) = (x+\alpha)(x+\alpha^2)(x+\alpha^3)(x+\alpha^4)(x+\alpha^5)(x+\alpha^6)(x+\alpha^7)(x+\alpha^8)(x+\alpha^9)(x+\alpha^{10})$
$g(x) = x^{10} + \alpha^2 x^9 + \alpha^3 x^8 + \alpha^9 x^7 + \alpha^6 x^6 + \alpha^{14} x^5 + \alpha^2 x^4 + \alpha x^3 + \alpha^6 x^2 + \alpha x + \alpha^{10}$
$g(x) = x^{10} + g_9 x^9 + g_8 x^8 + g_7 x^7 + g_6 x^6 + g_5 x^5 + g_4 x^4 + g_3 x^3 + g_2 x^2 + g_1 x + g_0$

In the case of the (15,7) code, this gives a generator polynomial of

$h(x) = (x+\alpha)(x+\alpha^2)(x+\alpha^3)(x+\alpha^4)(x+\alpha^5)(x+\alpha^6)(x+\alpha^7)(x+\alpha^8)$

$h(x) = x^8 + \alpha^{14} x^7 + \alpha^2 x^6 + \alpha^4 x^5 + \alpha^2 x^4 + \alpha^{13} x^3 + \alpha^5 x^2 + \alpha^{11} x + \alpha^6$

$h(x) = x^8 + h_7 x^7 + h_6 x^6 + h_5 x^5 + h_4 x^4 + h_3 x^3 + h_2 x^2 + h_1 x + h_0$

The output code words are produced by dividing the generator polynomial into a polynomial made up
from the input symbols.

This division is accomplished using the circuit shown in Figure 175.
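The generator polynomial coefficients can be checked by expanding the products over GF(2^4). The following Python sketch does this, assuming elements are held as 4-bit integers in which bit j is the coefficient of x^j. The helper names (EXP, LOG, gf_mul, generator_poly, as_powers) are illustrative and not part of the specification.

```python
PRIM_POLY = 0b10011  # p(x) = x^4 + x + 1

# EXP[i] holds alpha^i as a 4-bit integer (bit j is the coefficient of x^j).
EXP = []
value = 1
for _ in range(15):
    EXP.append(value)
    value <<= 1
    if value & 0b10000:
        value ^= PRIM_POLY
LOG = {v: i for i, v in enumerate(EXP)}


def gf_mul(a, b):
    """Multiply two GF(2^4) elements by adding exponents modulo 15."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 15]


def generator_poly(num_roots):
    """Coefficients, lowest degree first, of (x + alpha)...(x + alpha^num_roots)."""
    poly = [1]
    for i in range(1, num_roots + 1):
        nxt = [0] * (len(poly) + 1)
        for degree, coeff in enumerate(poly):
            nxt[degree] ^= gf_mul(coeff, EXP[i])  # coeff * alpha^i
            nxt[degree + 1] ^= coeff              # coeff * x
        poly = nxt
    return poly


def as_powers(poly):
    """Express each non-zero coefficient as its exponent of alpha (None for zero)."""
    return [LOG.get(c) for c in poly]


g = as_powers(generator_poly(10))  # (15,5) code
h = as_powers(generator_poly(8))   # (15,7) code
print("g(x) exponents, x^0 first:", g)
print("h(x) exponents, x^0 first:", h)
```

The exponent lists are printed lowest degree first, so the first entry of each list is the constant term of the polynomial.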

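[Figure 175: shift-register encoder circuit with control inputs control_5, control_7 and TE_dataredun,
multiplexers including mux2, input rs_data_in[3:0] and an output carrying the codeword symbols.
Legend: one symbol denotes a multiplier that multiplies Galois Field elements, the other denotes an adder
that adds Galois Field elements.]

Figure 175. (15,5) & (15,7) RS Encoder block diagram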

The data in the circuit are Galois Field elements, so addition and multiplication are performed using special
circuitry. These are explained in the next sections.

The RS coder can operate in either (15,5) or (15,7) mode. The selection is made by the registers
TE_dataredun and TE_decode2den.

When operating in (15,5) mode control_7 is always zero, and when operating in (15,7) mode control_5 is
always zero.

Firstly consider (15,5) mode, i.e. TE_dataredun is set to zero.

For each new set of 5 input symbols, processing is as follows:

The 4 bits of the first symbol d0 are fed to the input port rs_data_in[3:0] and control_5 is set to 0. mux2 is
set so as to use the output as feedback. control_5 is zero so mux4 selects the input (rs_data_in) as the
output (rs_data_out). Once the data has settled (within 1 cycle), the shift registers are clocked. The next
symbol d1 is then applied to the input, and again after the data has settled the shift registers are clocked
again. This is repeated for the next 3 symbols d2, d3 and d4. As a result, the first 5 outputs are the same as
the inputs. After 5 cycles the shift registers contain the next 10 required outputs. control_5 is set to 1 for the
next 10 cycles so that zeros are fed back by mux2 and the shift register values are fed to the output by mux3
and mux4 by simply clocking the registers.

A timing diagram is shown below.

[Figure 176: timing diagram showing control_5 and control_7.]

Figure 176. (15,5) RS Encoder timing diagram

Secondly consider (15,7) mode, i.e. TE_dataredun is set to one.

In this case processing is similar to above, except that control_7 stays low while the 7 input symbols are
fed in. After these 7 cycles, control_7 is set to 1 and the contents of the shift registers are fed to the output.
A timing diagram is shown below.

[Figure 177: timing diagram showing clk, rs_data_out[3:0], rs_counter, TE_dataredun, control_5 and
control_7.]

Figure 177. (15,7) RS Encoder timing diagram

The enable signal can be used to start/reset the counter and the shift registers.

The RS encoder can be designed so that encoding starts on a rising enable edge. After 15 symbols have
been output, the encoder stops until a rising enable edge is detected. As a result there will be a delay
between each codeword.

Alternatively, once the enable goes high the shift registers are reset and encoding will proceed until it is
told to stop. rs_data_in must be supplied at the correct time. Using this method, data can be continuously
output at a rate of 1 symbol per cycle, even over a few codewords.

Alternatively, the RS encoder can request data as it requires.

The performance criterion that must be met is that the following must be carried out within 63 cycles:

• load one tag's raw data into TE_tagdata

• encode the raw tag data

• store the encoded tag data in the Encoded Tag Data Interface

In the case of the raw fixed tag data at the start of a page, there is no definite performance criterion except
that it should be encoded and stored as fast as possible.
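The two-phase shift-register procedure described in this section (data symbols passed straight through while feedback is active, then the register contents clocked out with zero feedback) can be modelled in software. The following Python sketch is a behavioural model only, not the hardware design: it assumes the first symbol fed in is the highest-order coefficient of the codeword, and all names (rs_encode, poly_eval and so on) are illustrative.

```python
PRIM_POLY = 0b10011  # p(x) = x^4 + x + 1

EXP = []             # EXP[i] = alpha^i as a 4-bit integer
value = 1
for _ in range(15):
    EXP.append(value)
    value <<= 1
    if value & 0b10000:
        value ^= PRIM_POLY
LOG = {v: i for i, v in enumerate(EXP)}


def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 15]


def generator_poly(num_roots):
    """Coefficients, lowest degree first, of (x + alpha)...(x + alpha^num_roots)."""
    poly = [1]
    for i in range(1, num_roots + 1):
        nxt = [0] * (len(poly) + 1)
        for degree, coeff in enumerate(poly):
            nxt[degree] ^= gf_mul(coeff, EXP[i])
            nxt[degree + 1] ^= coeff
        poly = nxt
    return poly


def rs_encode(data, n=15):
    """Systematic (n, k) encoding of k 4-bit symbols; k is len(data)."""
    k = len(data)
    g = generator_poly(n - k)
    regs = [0] * (n - k)
    out = []
    # Phase 1: control bit low. Input passes to the output and drives the feedback.
    for d in data:
        out.append(d)
        feedback = d ^ regs[-1]
        for i in range(n - k - 1, 0, -1):
            regs[i] = regs[i - 1] ^ gf_mul(feedback, g[i])
        regs[0] = gf_mul(feedback, g[0])
    # Phase 2: control bit high. Zeros are fed back and the registers are clocked out.
    for _ in range(n - k):
        out.append(regs[-1])
        regs = [0] + regs[:-1]
    return out


def poly_eval(codeword, x):
    """Evaluate the codeword polynomial at x (first symbol is the highest degree)."""
    acc = 0
    for symbol in codeword:
        acc = gf_mul(acc, x) ^ symbol
    return acc


codeword = rs_encode([1, 2, 3, 4, 5])
print(codeword)
```

A valid codeword evaluates to zero at every root of the generator polynomial, which gives a simple check: `poly_eval(codeword, EXP[i])` should be 0 for i = 1..10 in (15,5) mode and for i = 1..8 in (15,7) mode.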
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26.7.10 Galois Field elements and their representation 

A Galois Field is a set of elements in which we can do addition, subtraction, multiplication and division
without leaving the set.

The TE uses RS encoding over the Galois Field GF(2^4). There are 2^4 elements in GF(2^4) and they are
generated using the primitive polynomial p(x) = x^4 + x + 1.

The 16 elements of GF(2^4) can be represented in a number of different ways. Table 129 shows three possible
representations - the power, polynomial and 4-tuple representation.

Table 129. GF(2^4) representations

Power      Polynomial              4-tuple (a0 a1 a2 a3)
0          0                       (0 0 0 0)
α^0        1                       (1 0 0 0)
α^1        x                       (0 1 0 0)
α^2        x^2                     (0 0 1 0)
α^3        x^3                     (0 0 0 1)
α^4        1 + x                   (1 1 0 0)
α^5        x + x^2                 (0 1 1 0)
α^6        x^2 + x^3               (0 0 1 1)
α^7        1 + x + x^3             (1 1 0 1)
α^8        1 + x^2                 (1 0 1 0)
α^9        x + x^3                 (0 1 0 1)
α^10       1 + x + x^2             (1 1 1 0)
α^11       x + x^2 + x^3           (0 1 1 1)
α^12       1 + x + x^2 + x^3       (1 1 1 1)
α^13       1 + x^2 + x^3           (1 0 1 1)
α^14       1 + x^3                 (1 0 0 1)



26.7.11 Multiplication of GF(2*) elements 

The multiplication of two field elements $\alpha^a$ and $\alpha^b$ is defined as

$\alpha^c = \alpha^a \cdot \alpha^b = \alpha^{(a+b) \bmod 15}$

Thus, for example, $\alpha^6 \cdot \alpha^{12} = \alpha^{18 \bmod 15} = \alpha^3$.

So if we have the elements in exponential form, multiplication is simply a matter of modulo 15 addition.
If the elements are in polynomial/tuple form, the polynomials must be multiplied and reduced mod
$x^4 + x + 1$.

Suppose we wish to multiply the two field elements in GF(2^4):

$\alpha^a = a_3 x^3 + a_2 x^2 + a_1 x + a_0$
$\alpha^b = b_3 x^3 + b_2 x^2 + b_1 x + b_0$

where $a_i$, $b_i$ are in the field (0,1) (i.e. modulo 2 arithmetic).
Multiplying these out and using $x^4 + x + 1 = 0$ we get:

$\alpha^{a+b} = [(a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0) + a_3 b_3] x^3$
$\quad + [(a_0 b_2 + a_1 b_1 + a_2 b_0) + a_3 b_3 + (a_3 b_2 + a_2 b_3)] x^2$
$\quad + [(a_0 b_1 + a_1 b_0) + (a_3 b_2 + a_2 b_3) + (a_1 b_3 + a_2 b_2 + a_3 b_1)] x$
$\quad + [a_0 b_0 + a_1 b_3 + a_2 b_2 + a_3 b_1]$

$\alpha^{a+b} = [a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 (b_0 + b_3)] x^3$
$\quad + [a_0 b_2 + a_1 b_1 + a_2 (b_0 + b_3) + a_3 (b_2 + b_3)] x^2$
$\quad + [a_0 b_1 + a_1 (b_0 + b_3) + a_2 (b_2 + b_3) + a_3 (b_1 + b_2)] x$
$\quad + [a_0 b_0 + a_1 b_3 + a_2 b_2 + a_3 b_1]$

If we wish to multiply an arbitrary field element by a fixed field element we get a simpler form.
Suppose we wish to multiply $\alpha^b$ by $\alpha^3$.

In this case $\alpha^3 = x^3$ so (a0 a1 a2 a3) = (0 0 0 1). Substituting this into the above equation gives

$\alpha^c = (b_0 + b_3) x^3 + (b_2 + b_3) x^2 + (b_1 + b_2) x + b_1$

This can be implemented using simple XOR gates as shown in Figure 178.

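[Figure 178: XOR-gate circuit with inputs b3, b2, b1, b0 (the element α^b) and outputs c3, c2, c1, c0
(the product). The gate symbol denotes an exclusive OR gate.]

Figure 178. Circuit for multiplying by α^3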
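The bit-level product formula, and its reduction to a few XOR gates for a fixed multiplier, can be checked exhaustively in software. The following Python sketch assumes elements are held as 4-bit integers in which bit i is the coefficient a_i. The function names are illustrative, and gf16_mul_shift is an independent reference (shift-and-add followed by reduction modulo x^4 + x + 1) used only for comparison.

```python
def bits(value):
    """Split a 4-bit integer into (bit0, bit1, bit2, bit3)."""
    return tuple((value >> i) & 1 for i in range(4))


def gf16_mul_bits(a, b):
    """Product in GF(2^4) using the grouped bit-level equations."""
    a0, a1, a2, a3 = bits(a)
    b0, b1, b2, b3 = bits(b)
    c3 = (a0 & b3) ^ (a1 & b2) ^ (a2 & b1) ^ (a3 & (b0 ^ b3))
    c2 = (a0 & b2) ^ (a1 & b1) ^ (a2 & (b0 ^ b3)) ^ (a3 & (b2 ^ b3))
    c1 = (a0 & b1) ^ (a1 & (b0 ^ b3)) ^ (a2 & (b2 ^ b3)) ^ (a3 & (b1 ^ b2))
    c0 = (a0 & b0) ^ (a1 & b3) ^ (a2 & b2) ^ (a3 & b1)
    return c0 | (c1 << 1) | (c2 << 2) | (c3 << 3)


def gf16_mul_shift(a, b):
    """Reference product: carry-less multiply, then reduce modulo x^4 + x + 1."""
    result = 0
    for i in range(4):
        if (b >> i) & 1:
            result ^= a << i
    for i in range(6, 3, -1):
        if (result >> i) & 1:
            result ^= 0b10011 << (i - 4)
    return result


def mul_by_alpha3(b):
    """Fixed multiplier by alpha^3 = x^3: only three XOR gates are needed."""
    b0, b1, b2, b3 = bits(b)
    return b1 | ((b1 ^ b2) << 1) | ((b2 ^ b3) << 2) | ((b0 ^ b3) << 3)


print(gf16_mul_bits(0b0010, 0b1000))  # alpha * alpha^3 = alpha^4 = 1 + x
```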

26.7.12 Addition of GF(2^4) elements

If the elements are in their polynomial/tuple form, polynomials are simply added.
Suppose we wish to add the two field elements in GF(2^4):

$\alpha^a = a_3 x^3 + a_2 x^2 + a_1 x + a_0$

$\alpha^b = b_3 x^3 + b_2 x^2 + b_1 x + b_0$

where $a_i$, $b_i$ are in the field (0,1) (i.e. modulo 2 arithmetic).

$\alpha^c = \alpha^a + \alpha^b = (a_3 + b_3) x^3 + (a_2 + b_2) x^2 + (a_1 + b_1) x + (a_0 + b_0)$

Again this can be implemented using simple XOR gates as shown in Figure 179.

[Figure 179: XOR gates adding the corresponding bits of the two elements. The gate symbol denotes an
exclusive OR gate.]

Figure 179. Adding two field elements



26.7.13 Reed Solomon Implementation 



Consider the multiplication

$\alpha^a \cdot \alpha^b = \alpha^c$

or in terms of polynomials

$(a_3 x^3 + a_2 x^2 + a_1 x + a_0) \cdot (b_3 x^3 + b_2 x^2 + b_1 x + b_0) = (c_3 x^3 + c_2 x^2 + c_1 x + c_0)$

Table 130. α^b multiplied by all field elements, expressed in terms of b0, b1, b2, b3

The following signals are required:

• b0, b1, b2, b3

• (b0+b1), (b0+b2), (b0+b3), (b1+b2), (b1+b3), (b2+b3)

• (b0+b1+b2), (b0+b1+b3), (b0+b2+b3), (b1+b2+b3)

• (b0+b1+b2+b3)

The implementation of the circuit can be seen in Figure 180. The main components are XOR gates, 4-bit
shift registers and multiplexers.

The RS encoder has 4 input lines labelled 0, 1, 2 & 3 and 4 output lines labelled 0, 1, 2 & 3. This labelling
corresponds to the subscripts of the polynomial/4-tuple representation. The mapping of 4-bit symbols
from the TE_tagdata register into the RS is as follows:

- the LSB in the TE_tagdata is fed into line0

- the next most significant LSB is fed into line1

- the next most significant LSB is fed into line2

- the MSB is fed into line3

The RS output mapping to the Encoded tag data interface is similar. Two encoded symbols are stored in
an 8-bit address. Within these 8 bits:

- line0 is fed into the LSB (bit 0/4)

- line1 is fed into the next most significant LSB (bit 1/5)

- line2 is fed into the next most significant LSB (bit 2/6)

- line3 is fed into the MSB (bit 3/7)
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Figure 180. RS Encoder Implementation 



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26.7.14 2D Decoder 

The 2D decoder is selected when TE_decode2den = 1. It operates on variable tag data only. Its function is
to convert 2 bits into 4 bits according to Table 131.

Table 131. Operation of 2D decoder

Input      Output
00         0001
01         0010
10         0100
11         1000
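Table 131 is a 2-to-4 one-hot decode, so each output is a single 1 shifted left by the input value. The following Python sketch shows the mapping and applies it to a multi-bit word. The ordering of the expanded pairs within the output word (lowest pair first) is an assumption made for illustration, as are the function names.

```python
def decode_2d_pair(two_bits):
    """Map a 2-bit value to its 4-bit one-hot code as in Table 131."""
    if not 0 <= two_bits <= 3:
        raise ValueError("input must be a 2-bit value")
    return 1 << two_bits


def decode_2d(value, num_bits):
    """Expand num_bits of raw data into 2 * num_bits, two input bits at a time."""
    out = 0
    for i in range(0, num_bits, 2):
        pair = (value >> i) & 0b11
        out |= decode_2d_pair(pair) << (2 * i)  # assumed: pair i/2 lands in nibble i/2
    return out


print(format(decode_2d(0b1100, 4), "08b"))
```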



26.7.15 Encoded tag data interface

The encoded tag data interface contains an encoded fixed tag data store interface and an encoded variable
tag data store interface, as shown in Figure 181.

[Figure 181: block diagram showing the advTag input and the etd0 and etd1 outputs.]

Figure 181. Encoded tag data interface
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The two reord units simply reorder the 9 input bits to map low-order codewords into the bit selection com- 
ponent of the address as shown in Table 132. Reordering of write addresses is not necessary since the 
addresses are already in the correct format. 

Table 132. Reord unit

Input bit   Input interpretation       Output bit   Output interpretation
A           select 1 of 8 codewords    A            select 1 of 4 codeword tables
B                                      B
C                                      D            select 1 of 15 symbols
D           select 1 of 15 symbols     E
E                                      F
F                                      G
G                                      C            select 1 of 8 bits
H           select 1 of 4 bits         H
I                                      I
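The reordering in Table 132 amounts to moving the low-order codeword bit (C) down next to the bit-select bits (H, I). The following Python sketch shows one reading of that table. It assumes the 9 address bits are labelled A to I from most significant to least significant, which the source does not state explicitly, and the function name is illustrative.

```python
def reord(addr9):
    """Reorder address bits A B C D E F G H I into A B D E F G C H I."""
    if not 0 <= addr9 < 512:
        raise ValueError("address must be 9 bits")
    a_b = (addr9 >> 7) & 0b11       # bits A, B: select 1 of 4 codeword tables
    c = (addr9 >> 6) & 0b1          # bit C: low-order codeword bit
    d_to_g = (addr9 >> 2) & 0b1111  # bits D..G: select 1 of 15 symbols
    h_i = addr9 & 0b11              # bits H, I: select 1 of 4 bits
    return (a_b << 7) | (d_to_g << 3) | (c << 2) | h_i


print(format(reord(0b001000000), "09b"))  # bit C alone moves to the bit-select field
```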



The encoded fixed data interface is a single 15 x 8-bit RAM with 2 read ports and 1 write port. As it is only
written to during page setup time (it is fixed for the duration of a page) there is no need for simultaneous
read/write access. However, the fixed data store must be capable of decoding two simultaneous reads in a
single cycle. Figure 182 shows the implementation of the fixed data store.

[Figure 182: RAM with read address inputs rdAdr0 and rdAdr1, write address input wrAdr, a write
enable input, and outputs out0 and out1.]

Figure 182. Encoded fixed tag data interface

The encoded variable tag data interface is a double buffered 3 x 15 x 8-bit RAM with 2 read ports and 1
write port. The double buffering allows one tag's data to be read (two reads in a single cycle) while the
next tag's variable data is being stored. Write addressing is 6 bits: 2 bits of address for selecting 1 of 3, and
4 bits of address for selecting 1 of 15. Read addressing is the same with the addition of 3 more address bits
for selecting 1 of 8.

Figure 183 shows the implementation of the encoded variable tag data store. Double buffering is
implemented via two sub-buffers. Each time an AdvTag pulse is received, the sense of which sub-buffer is being
read from or written to changes. This is accomplished by a 1-bit flag called wrsb0. Although the initial

[Figure 183: two sub-buffers switched by advTag, with outputs out0 and out1.]

Figure 183. Encoded variable tag data interface

[Figure 184: encoded variable tag data sub-buffer with outputs out0 and out1.]

Figure 184. Encoded variable tag data sub-buffer
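The double buffering scheme can be illustrated with a small behavioural model. The Python sketch below is an assumption-laden illustration rather than the hardware design: the class and method names are hypothetical, the initial value of wrsb0 is assumed to be 1, and a write address is modelled as a (table, location) pair matching the 2-bit and 4-bit address fields described above.

```python
class EncodedVariableStore:
    """Behavioural model of two 3 x 15 x 8-bit sub-buffers selected by wrsb0."""

    LOCATIONS_PER_TABLE = 15
    TABLES = 3

    def __init__(self):
        size = self.TABLES * self.LOCATIONS_PER_TABLE
        self.sub_buffers = [[0] * size, [0] * size]
        self.wrsb0 = 1  # assumed initial sense: write sub-buffer 0, read sub-buffer 1

    def adv_tag(self):
        """An AdvTag pulse swaps the read and write sub-buffers."""
        self.wrsb0 ^= 1

    def _index(self, table, location):
        if not (0 <= table < self.TABLES and 0 <= location < self.LOCATIONS_PER_TABLE):
            raise IndexError("address out of range")
        return table * self.LOCATIONS_PER_TABLE + location

    def write(self, table, location, byte):
        target = 0 if self.wrsb0 else 1
        self.sub_buffers[target][self._index(table, location)] = byte & 0xFF

    def read(self, table, location):
        source = 1 if self.wrsb0 else 0
        return self.sub_buffers[source][self._index(table, location)]


store = EncodedVariableStore()
store.write(2, 14, 0xAB)       # next tag's data goes to the write sub-buffer
before = store.read(2, 14)     # the read side still sees the previous contents
store.adv_tag()
after = store.read(2, 14)      # after AdvTag the written data becomes readable
print(before, after)
```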
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26.8 Tag Format Structure (TFS) Interface 



26.8.1 Introduction 

The TFS specifies the contents of every dot position within a tag's border, i.e.:

• is the dot part of the background?

• is the dot part of the data?

The TFS is broken up into Tag Line Structures (TLS) which specify the contents of every dot position in a
particular line of a tag. Each TLS consists of three tables - A, B and C (see Figure 185).

For a given line of dots, all the tags on that line correspond to the same tag line structure. Consequently for
a given line of output dots, a single tag line structure is required, and not the entire TFS. Double buffering
allows the next tag line structure to be fetched from the TFS in DRAM while the existing tag line structure
is used to render the current tag line.

The TFS interface is responsible for loading the appropriate line of the tag format structure as the tag
encoder advances through the page. It is also responsible for producing table A and table B outputs for two
consecutive dot positions in the current tag line.

[Figure 185: the Tag Format Structure for tag X occupies DRAM from TE_tfsstartadr to TE_tfsendadr as a
sequence of tag line structures TLS X_0, TLS X_1, TLS X_2 ... TLS X_n, one per dot line of the tag,
followed by TLS X+1_0, TLS X+1_1, TLS X+1_2 ... TLS X+1_n. Each TLS is made up of:
- Table A: 384 entries x 2 bits
- Table B: 32 entries x 9 bits = 288 bits (9 x 32 bits)
- Table C: 2 entries x 5 bits = 10 bits, followed by 22 bits reserved and unused]

Figure 185. Breakdown of the Tag Format Structure

• There is a TLS for every dot line of a tag.

• All tags that are on the same line have the exact same TLS.

• A tag can be up to 384 dots wide, so each of these 384 dots must be specified in the TLS.

• The TLS information is stored in DRAM and one TLS must be read into the TFS Interface for each
line of dots that is output to the Tag Plane Line Buffers.

• Each TLS consists of 17 64-bit words. This is read from DRAM as 5 times 256-bit words with 192
padded bits in the last 256-bit DRAM read.
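The TLS size quoted in the last bullet can be cross-checked against the table sizes. The following Python sketch assumes the table dimensions read from Figure 185 (Table A of 384 x 2 bits, Table B of 32 x 9 bits, Table C of 2 x 5 bits plus 22 reserved bits). The Table B dimensions in particular are reconstructed from a damaged figure and should be treated as an assumption.

```python
import math

TABLE_A_BITS = 384 * 2   # one 2-bit entry per dot of a 384-dot wide tag
TABLE_B_BITS = 32 * 9    # assumed: 32 entries of 9 bits
TABLE_C_BITS = 2 * 5     # 2 entries of 5 bits
RESERVED_BITS = 22       # unused bits completing Table C's 32-bit word


def tls_layout(dram_word_bits=256):
    """Return (total bits, 64-bit words, DRAM reads, padding bits) for one TLS."""
    total = TABLE_A_BITS + TABLE_B_BITS + TABLE_C_BITS + RESERVED_BITS
    words64 = total // 64
    dram_reads = math.ceil(total / dram_word_bits)
    padding = dram_reads * dram_word_bits - total
    return total, words64, dram_reads, padding


print(tls_layout())
```

Under these assumptions the totals agree with the 17 x 64-bit words, 5 DRAM reads and 192 padded bits stated above.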

26.8.2 I/O Specification 

Table 133. Tag Format Structure Interface Port List

Port name                    I/O    Description
pclk                         In     SoPEC system clock
prst_n                       In     Active-low, synchronous reset in pclk domain
top_go                       In
DRAM
diu_data[63:0]               In
diu_tfs_rack                 In
diu_tfs_rvalid               In     Data valid from DRAM
tfs_diu_rreq                 Out    Read request to DRAM
tfs_diu_radr[21:5]           Out    Read address to DRAM
Tag encoder top level
top_advtagline               In     Pulsed after the last line of a row of tags
top_tagaltsense              In     For even tag rows = 0, i.e. 0, 2, 4...
                                    For odd tag rows = 1, i.e. 1, 3, 5...
top_lastdotintag             In     Last dot in tag is currently being processed
top_dotposvalid              In     Current dot position is a tag dot and its structure data and tag data
                                    is available
top_tagdotnum[7:0]           In     Counts from zero up to TE_tagmaxdotpairs (min. = 1, max. = 192)
tfs_valid                    Out    TLS tables A, B and C ready for use
tfs_ta_dot0[1:0]             Out    Even entry from Table A corresponding to top_tagdotnum
tfs_ta_dot1[1:0]             Out    Odd entry from Table A corresponding to top_tagdotnum
Tag encoder top level (PCU read decoder)
tfs_te_tfsstartadr[23:0]     Out    TFS tfsstartadr register
tfs_te_tfsendadr[23:0]       Out    TFS tfsendadr register
tfs_te_tfsfirstlineadr[23:0] Out    TFS tfsfirstlineadr register
tfs_te_currtfsadr[23:0]      Out    TFS currtfsadr register
TDI
tfs_tdi_adr0[8:0]            Out    Read address for dot0 (even dot)
tfs_tdi_adr1[8:0]            Out    Read address for dot1 (odd dot)

26.8.2.1 State machine

The state machine is responsible for generating control signals for the various TFS table units, and for loading
the appropriate line from the TFS. The states are explained below.

idle:- Wait for top_go to become active. Pulse adv_tfs_line for 1 cycle to reset the tawradr and tbwradr
registers. Pulsing adv_tfs_line will switch the read/write sense of Table B, so Table A is switched here as
well to keep the two consistent, i.e. wrta0 = NOT(wrta0).

diu_access:- In the diu_access state a request is sent to the DIU. Once an ack signal is received, Table A
write enable is asserted and the FSM moves to the tls_load state.

tls_load:- The DRAM access is a burst of 5 256-bit accesses, ultimately returned by the DIU as
5*(4*64-bit) words. There will be 192 padded bits in the last 256-bit DRAM word. The first 12 64-bit
word reads are for Table A, words 12 to 15 and some of 16 are for Table B, while part of read 16 data is for
Table C. The counter read_num is used to identify which data goes to which table. The Table B data is
stored temporarily in a 288-bit register until the tls_update state (hence tbwe does not become active until
read_num = 16).

• The DIU data goes directly into Table A (12 * 64).

• The DIU data for Table B is loaded into a 288-bit register.

• The DIU data goes directly into Table C.
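The routing of the 17 returned words by read_num can be sketched as follows. This is an illustration only: the text says which words feed which table, but the bit positions within word 16 and the ordering of words inside the 288-bit register are assumptions made here (low 32 bits complete Table B, the next 10 bits hold Table C), and the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def route_tls_words(words):
    """Split the 17 64-bit words of one TLS into Tables A, B and C.

    Returns (table_a_words, table_b_288bit_value, table_c_10bit_value).
    """
    if len(words) != 17:
        raise ValueError("a TLS is 17 64-bit words")
    table_a = list(words[:12])              # read_num 0..11 go straight to Table A
    table_b = 0
    for i, w in enumerate(words[12:16]):    # read_num 12..15 fill the temporary register
        table_b |= (w & MASK64) << (64 * i)
    last = words[16] & MASK64               # read_num 16 is shared by Tables B and C
    table_b |= (last & 0xFFFFFFFF) << 256   # ASSUMPTION: low 32 bits complete Table B
    table_c = (last >> 32) & 0x3FF          # ASSUMPTION: next 10 bits are Table C
    return table_a, table_b, table_c


words = list(range(16)) + [(0x155 << 32) | 0xDEADBEEF]
a, b, c = route_tls_words(words)
print(len(a), b.bit_length(), hex(c))
```

Only after word 16 has arrived is the full 288-bit Table B value available, which is consistent with tbwe staying inactive until read_num reaches 16.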



tls_update:- The 288 bits in Table B need to be written to a 32*9 buffer. The tls_update state takes care of
this using the read_num counter.

tls_next:- This state checks the logic level of tfsvalid and switches the read/write senses of Table A (wrta0)
and Table B a cycle later (using the adv_tfs_line pulse). The reason for switching Table A a cycle early is
to make sure the top_level address via tagdotnum is pointing to the correct buffer. Keep in mind that the
top_level is working a cycle ahead of Table A and 2 cycles ahead of Table B.

If tfsValid is 1, the state machine waits until the advTagLine signal is received. When it is received, the
state machine pulses advTFSLine (to switch read/write sense in tables A, B, C), and starts reading the next
line of the TFS from currTFSAdr.

If tfsValid is 0, the state machine pulses advTFSLine (to switch read/write sense in tables A, B, C) and then
jumps to the tls_tfsvalid_set state where the signal tfsValid is set to 1 (allowing the tag encoder to start, or
to continue if it had been stalled). The state machine can then start reading the next line of the TFS from
currTFSAdr.

tls_tfsvalid_set:- Simply sets the tfsvalid signal and returns the FSM to the diu_access state.

If an advTagLine signal is received before the next line of the TFS has been read in, tfsValid is cleared to 0
and processing continues as outlined above.

The TFS state flow diagram is shown below.

[Figure 186 (state flow diagram not reproduced): states idle, diu_access, tls_load, tls_update, tls_next and
tls_tfsvalid_set, with transitions conditioned on top_go, top_advtagline, read_num and tfs_valid.]

Figure 186. TFSI FSM State Flow Diagram

26.8.3 Generating a tag from Tables A, B and C

The TFS contains an entry for each dot position within the tag's bounding box. Each entry specifies
whether the dot is part of the background pattern or part of the tag's data component (both fixed
and variable).

The entries that specify a single dot-line of a tag are known as a Tag Line Structure.




Doc: SoPEC hardware design, Version: 2.3, 29 Nov 2002



Each output dot value is generated as follows: each entry in Table A consists of 2 bits, bit0 and bit1.
These 2 bits are interpreted according to Table 134, Table 135 and Table 136.

Table 134. Interpretation of bit0 from entry in Table A

bit0 | Interpretation
0 | The output bit comes directly from bit1 (see Table 135).
1 | bit1 is used in conjunction with Tag Line Structure Table B to determine which data bit will be output (see Table 136).

Table 135. Interpretation of bit1 from entry in Table A when bit0 = 0

bit1 | Interpretation
0 | output 0
1 | output 1

Table 136. Interpretation of bit1 from entry in Table A when bit0 = 1

bit1 | Interpretation
0 | output data bit pointed to by current index into Table B
1 | output data bit pointed to by current index into Table B, and advance index by 1
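The per-dot decision described by the three tables above can be sketched as follows. This is a software illustration of the rule rather than the hardware datapath, which handles two dots per cycle; the function name, the use of (bit0, bit1) tuples and the dictionary standing in for the tag data store are all assumptions made for the example.

```python
def render_tag_line(table_a, table_b, tag_data, start_index=0):
    """Generate the dots of one tag line.

    table_a     : list of (bit0, bit1) entries, one per dot
    table_b     : list of tag data addresses, in order of appearance
    tag_data    : mapping from address to data bit (hypothetical stand-in)
    start_index : initial index into Table B
    """
    index = start_index
    dots = []
    for bit0, bit1 in table_a:
        if bit0 == 0:
            dots.append(bit1)                       # background: output bit1 directly
        else:
            dots.append(tag_data[table_b[index]])   # data dot via Table B
            if bit1 == 1:
                index += 1                          # advance to the next Table B entry
    return dots


table_a = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (0, 1)]
table_b = [5, 9]
tag_data = {5: 1, 9: 0}
print(render_tag_line(table_a, table_b, tag_data))
```

An entry with bit0 = 1 and bit1 = 0 lets the same data bit be printed over several adjacent dots before the index moves on.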



Table B holds the addresses of the data bits used on the line, in order of appearance. Therefore, up to
32 different data bits can be referenced in a given tag line. The address of the first data dot in a tag
is given by the Table B entry at the current index, and subsequent data dots advance through the various
Table B entries.

Each Table B entry is a 9-bit address. For the purposes of address decoding, the addresses are based on
the RS encoded tag data. Table 137 lists the interpretation of the 9-bit addresses:



Table 137. Interpretation of 9-bit tag data address in Table B

Interpretation (most significant field first)
Select 1 of 8 codewords. Codewords 0, 1, 2, 3, 4, 5 are variable data. Codewords 6, 7 are fixed data.
Select 1 of 15 symbols (1111 invalid)
Select 1 of 4 bits from the selected symbol
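A 9-bit address of this kind can be unpacked as shown below. The field widths follow from the counts in Table 137 (8 codewords, 16 symbol codes, 4 bits), but the exact bit positions are an assumption here: the sketch places the codeword in the top 3 bits, the symbol in the next 4 and the bit select in the low 2. The function name is hypothetical.

```python
def decode_tag_data_address(addr):
    """Split a 9-bit Table B address into (codeword, symbol, bit, is_fixed)."""
    if not 0 <= addr < 512:
        raise ValueError("address must fit in 9 bits")
    codeword = (addr >> 6) & 0x7    # ASSUMED position: bits 8..6
    symbol = (addr >> 2) & 0xF      # ASSUMED position: bits 5..2
    bit = addr & 0x3                # ASSUMED position: bits 1..0
    if symbol == 0xF:
        raise ValueError("symbol 1111 is invalid")
    is_fixed = codeword >= 6        # codewords 6 and 7 hold fixed data
    return codeword, symbol, bit, is_fixed


print(decode_tag_data_address((6 << 6) | (14 << 2) | 3))
```

With this packing, every valid address selects exactly one of 8 x 15 x 4 = 480 data bits.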



If the fixed data is supplied to the TE in an unencoded form, the symbols derived from codeword 0 of fixed
data are written to codeword 6 and the symbols derived from fixed data codeword 1 are written to
codeword 7. The data symbols are stored first and then the remaining redundancy symbols are stored
afterwards. When 5 data symbols are used, the data symbols are written to symbols 0-4, and the redundancy
symbols are written to symbols 5-14. When 7 data symbols are used, the data symbols are written to
symbols 0-6, and the redundancy symbols are written to symbols 7-14.

When the fixed data is supplied to the TE in a pre-encoded form, the encoding could theoretically be
anything. Consequently the 120 bits of fixed data is copied to codewords 6 and 7 as shown in Table 138.

Table 138. Mapping of fixed data to codeword/symbols when no redundancy encoding

Input bits | Symbols | Codeword
0-19 | 0-4 | 6
20-39 | 0-4 | 7
40-59 | 5-9 | 6
60-79 | 5-9 | 7
80-99 | 10-14 | 6
100-119 | 10-14 | 7
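The mapping in Table 138 is regular enough to express as a formula. The sketch below assumes that each 20-bit group fills its 5 symbols in ascending order, 4 bits per symbol; the table only gives the group-level mapping, so the ordering within a group and the function name are assumptions.

```python
def fixed_bit_location(bit):
    """Map a pre-encoded fixed data bit (0..119) to (codeword, symbol, bit_in_symbol)."""
    if not 0 <= bit < 120:
        raise ValueError("fixed data is 120 bits")
    group, offset = divmod(bit, 20)            # six groups of 20 bits (Table 138 rows)
    codeword = 6 + group % 2                   # groups alternate between codewords 6 and 7
    symbol = 5 * (group // 2) + offset // 4    # ASSUMPTION: ascending symbols, 4 bits each
    return codeword, symbol, offset % 4


for b in (0, 19, 20, 40, 119):
    print(b, fixed_bit_location(b))
```

For example, bit 40 starts the third row of the table and lands in codeword 6, symbol 5.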



It is important to note that the interpretation of bit1 from Table A (when bit0 = 1) is relative. A 5-bit index
is used to cycle through the data addresses in Table B. Since the first tag on a particular line may or may not
start at the first dot in the tag, an initial value for the index into Table B is needed. Subsequent tags on the
same line will always start with an index of 0, and any partial tag at the end of a line will simply finish
before the entire tag has been rendered. The initial index required due to the rendering of a partial tag at
the start of a line is supplied by Table C. The initial index will be different for each TLS and there are two
possible initial indexes since there are effectively two types of rows of tags in terms of initial offset.

26.8.4 Architecture

A block diagram of the Tag Format Structure Interface can be seen in Figure 187.

[Figure 187 (block diagram not reproduced): signals visible in the diagram include taOdd, taEven,
dotPosValid, tdRdAdr0 and tdRdAdr1.]

Figure 187. TFS Block Diagram

26.8.4.1 Table A interface

The implementation of Table A is two 16 x 64-bit RAMs with a small amount of control logic, as shown in
Figure 188. While one RAM is read from for the current line's Table A data, the other RAM is written to
with the next line's Table A data, 64 bits at a time.

[Figure 188 (block diagram not reproduced): an address generator (AdrGen) driven by advTFSLine, a read
address taRdAdr and a 64-bit datain feed two 16 x 64-bit RAMs, tableA(0) and tableA(1). The selected
4 bits are split into taEven (bits 1 and 0) and taOdd (bits 3 and 2), and registered as ta_dot0_1cyclelater
and ta_dot1_1cyclelater.]

Figure 188. Table A interface block diagram

Note that the top level works one cycle ahead of Table A and 2 cycles ahead of Table B. This is because of
pipelining in the TFS from registering the Table A and Table B outputs, hence this extra registering stage
for the generation of ta_dot0_1cyclelater and ta_dot1_1cyclelater.

Each time an advTFSLine pulse is received, the sense of which RAM is being read from or written to
changes. This is accomplished by a 1-bit flag called wrta0. Although the initial state of wrta0 is irrelevant,
it must invert upon receipt of an advTFSLine pulse.

[Figure 189 (diagram not reproduced): the Table A address generator takes advTFSLine as input and holds
wrta0 (1 bit) and the write address taWrAdr (4 bits).]

Figure 189. Table A address generator
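The read/write sense switching described above is an ordinary ping-pong buffer. The following is a minimal behavioural sketch, not the RTL: the class and attribute names are hypothetical, and the reset of the write address on each advTFSLine pulse follows the description of the idle state in the state machine section.

```python
class DoubleBufferedTable:
    """Two RAMs: one is written with the next line while the other is read."""

    def __init__(self, depth):
        self.rams = [[0] * depth, [0] * depth]
        self.wr_sel = 0   # plays the role of wrta0; its initial value is irrelevant
        self.wr_adr = 0   # plays the role of taWrAdr

    def write(self, word):
        """Store the next word of the incoming line."""
        self.rams[self.wr_sel][self.wr_adr] = word
        self.wr_adr += 1

    def read(self, adr):
        """Read from the RAM that is not currently being written."""
        return self.rams[1 - self.wr_sel][adr]

    def adv_tfs_line(self):
        """Swap read/write sense and restart the write address."""
        self.wr_sel ^= 1
        self.wr_adr = 0


t = DoubleBufferedTable(16)
for i in range(12):          # 12 words of Table A for the first line
    t.write(100 + i)
t.adv_tfs_line()
print(t.read(3))
```

The reader never sees a partially loaded line, because a RAM only becomes readable after the pulse that follows its last write.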
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26.8.4.2 Table C Interface 

A block diagram of the table C interface is shown below in Figure 190.

[Figure 190 (block diagram not reproduced): inputs visible in the diagram include tagAltSense, advTFSLine
and dotPosValid.]

Figure 190. Table C interface block diagram

The address generator for table C contains a 5-bit address register adr that is set to a new address at the
start of each tag. Each cycle, two addresses into Table B are generated. As shown in Table 139, the output
address tbRdAdr0 is always adr and tbRdAdr1 is one of adr and adr+1, and at the end of the cycle adr takes
on one of adr, adr+1 and adr+2.

Table 139. AdrGen lookup table

1. X = don't care state.



26.8.4.3 Table B Interface 



The table B interface implementation generates two encoded tag data addresses (tfsi_adr0, tfsi_adr1)
based on two table B input addresses (tbRdAdr0, tbRdAdr1). A block diagram of table B can be seen in
Figure 191.

[Figure 191 (block diagram not reproduced): inputs tbRdAdr0, tbRdAdr1, tbwe, advTFSLine, read_num
(from the TFS FSM), a 64-bit datain and pclk. An address generator (AdrGen) produces tbwradr. A 288-bit
temporary register (tableB tempreg) feeds two 32 x 9-bit sub buffers, tablesubB(0) and tablesubB(1).]

Figure 191. Table B Interface block diagram

Table B data is first loaded into the 288-bit temporary register via the TFS FSM. Once all 288 bits have
been loaded, the data is written in 9-bit chunks to the 32*9 register array.

Each time an advTFSLine pulse is received, the sense of which sub buffer is being read from or written to
changes. This is accomplished by a 1-bit flag called wrtb0. Although the initial state of wrtb0 is irrelevant,
it must invert upon receipt of an advTFSLine pulse.

Note:- The output addresses from Table B are registered.
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27 Tag FIFO Unit (TFU) 



27.1 Overview 

The Tag FIFO Unit (TFU) provides the means by which data is transferred between the Tag Encoder (TE)
and the HCU. By abstracting the buffering mechanism and controls from both units, the interface is clean
between the data user and the data generator.

The TFU is a simple FIFO interface to the HCU. The Tag Encoder will provide support for arbitrary Y
integer scaling up to 1600 dpi. X integer scaling of the tag dot data is performed at the output of the FIFO
in the TFU. There is feedback to the TE from the TFU to allow stalling of the TE during a line. The TE
interfaces to the TFU with a data width of 8 bits. The TFU interfaces to the HCU with a data width of 1 bit.
The depth of the TFU FIFO is chosen as 16 bytes so that the FIFO can store a single 126 dot tag.

27.1.1 Interfaces between TE, TFU and HCU



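[Figure 192 (block diagram not reproduced): the TE drives te_tfu_wdata (8 bits), te_tfu_wdatavalid and
te_tfu_wradvline into the TFU FIFO and receives tfu_te_oktowrite. The TFU drives tfu_hcu_tdata and
tfu_hcu_avail to the HCU and receives hcu_tfu_advdot.]

Figure 192. Interfaces between TE, TFU and HCU

27.1.1.1 TE-TFU Interface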

The interface from the TE to the TFU comprises the following signals: 

• te_tfu_wdata, 8-bit write data.

• te_tfu_wdatavalid, write data valid.

• te_tfu_wradvline, accompanies the last valid 8-bit write data in a line.

The interface from the TFU to TE comprises the following signal:

• tfu_te_oktowrite, indicating to the TE that there is space available in the TFU FIFO.

The TE writes data to the TFU FIFO as long as the TFU's tfu_te_oktowrite output bit is set. The TE write
will not occur unless data is accompanied by a data valid signal.

27.1.1.2 TFU-HCU Interface

The interface from the TFU to the HCU comprises the following signals: 

• tfu_hcu_tdata, 1-bit data.

• tfu_hcu_avail, data valid signal indicating that there is data available in the TFU FIFO.

The interface from HCU to TFU comprises the following signal:

• hcu_tfu_ready, indicating to the TFU to supply the next dot.



27.1.1.2.1 X scaling

Tag data is replicated in the X direction to convert the final output to 1600 dpi. Unlike the CFU and SFU,
which support non-integer scaling, the scaling here is integer only.

Replication in the X direction is performed at the output of the TFU FIFO on a dot-by-dot basis.

To cater for the case where there may be two SoPEC devices, each generating its own portion of a dot-line,
the first dot in a line may be replicated a different number of times (XFracScale). Subsequent dots are
replicated the scale-factor (XScale) number of times by an individual TFU.

Note that two SoPEC TEs may be involved in producing the same byte of output tag data straddling the
printhead boundary. The HCU of the left SoPEC will accept from its TFU only the correct number of dots
(HCUDotCount), ignoring any remaining dots in the final byte. The TE of the right SoPEC will be
programmed with the correct number of dots into the tag, and its output will be byte aligned with the left
edge of its printhead.
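The replication rule can be sketched in a few lines. This is an illustration under the assumptions stated in the register descriptions (XScale applies to every dot, XFracScale to the first dot of a line, and HCUDotCount caps the line); the function name and the list-based interface are hypothetical.

```python
def x_scale_line(dots, x_scale, x_frac_scale, hcu_dot_count):
    """Replicate tag dots in X for one line.

    The first dot is repeated x_frac_scale times, every later dot x_scale
    times, and output stops once hcu_dot_count dots have been produced.
    """
    out = []
    for i, dot in enumerate(dots):
        reps = x_frac_scale if i == 0 else x_scale
        out.extend([dot] * reps)
        if len(out) >= hcu_dot_count:
            break
    return out[:hcu_dot_count]


print(x_scale_line([1, 0, 1, 1], x_scale=3, x_frac_scale=2, hcu_dot_count=9))
```

A first-dot factor smaller than XScale lets a SoPEC whose portion of the line begins partway through a replicated dot emit only the remaining copies of that dot.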
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27.2 Definitions of I/O 



Table 140. TFU Port List

Port name | Pins | I/O | Description
Clocks and Resets
pclk | 1 | In | SoPEC Functional clock.
prst_n | 1 | In | Global reset signal.
PCU interface data and control signals
pcu_addr[3:2] | 2 | In | PCU address bus. Only 2 bits are required to decode the address space for this block.
pcu_dataout[31:0] | 32 | In | Shared write data bus from the PCU.
tfu_pcu_datain[31:0] | 32 | Out | Read data bus from the TFU to the PCU.
pcu_rwn | 1 | In | Common read/not-write signal from the PCU.
pcu_tfu_sel | 1 | In | Block select from the PCU. When pcu_tfu_sel is high both pcu_addr and pcu_dataout are valid.
tfu_pcu_rdy | 1 | Out | Ready signal to the PCU. When tfu_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on tfu_pcu_datain is valid.
TE interface data and control signals
te_tfu_wdata[7:0] | 8 | In | Write data for TFU FIFO.
te_tfu_wdatavalid | 1 | In | Write data valid signal.
te_tfu_wradvline | 1 | In | Advance line signal strobed when the last byte in a line is placed on te_tfu_wdata.
tfu_te_oktowrite | 1 | Out | Ready signal indicating TFU has space available in its FIFO and is ready to be written to.
HCU interface data and control signals
hcu_tfu_advdot | 1 | In | Signal indicating to the TFU that the HCU is ready to accept the next dot of data from TFU.
tfu_hcu_tdata | 1 | Out | Data from the TFU FIFO.
tfu_hcu_avail | 1 | Out | Signal indicating valid data available from TFU FIFO.

27.3 Configuration Registers



Table 141. TFU Configuration Registers

Address | Register name | #bits | Reset | Description
Control registers
0x00 | Reset | 1 | 1 | A write to this register causes a reset of the TFU. This register can be read to indicate the reset state: 0 - reset in progress, 1 - reset not in progress.
0x04 | Go | 1 | see text | Writing 1 to this register starts the TFU. Writing 0 to this register halts the TFU. When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values. When Go is asserted all counters are reset, but configuration registers keep their values (i.e. they don't get reset). The TFU must be started before the TE is started. This register can be read to determine if the TFU is running (1 = running, 0 = stopped).
Setup registers (constant during processing of page)
0x08 | XScale | 8 | 1 | Tag scale factor in X direction.
0x0C | XFracScale | 8 | 1 | Tag scale factor in X direction for the first dot in a line.
0x10 | TEByteCount | 12 | 0 | The number of bytes to be accepted from the TE per line. Once this number of bytes have been received subsequent bytes are ignored until there is a strobe on the te_tfu_wradvline.
0x14 | HCUDotCount | 15 | 0 | The number of (optionally) x-scaled dots per line to be supplied to the HCU. Once this number has been reached the remainder of the current FIFO byte is ignored.

27.4 Detailed description



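The FIFO accepts bytes from the TE. Once TEByteCount bytes have been received, subsequent bytes are
ignored until there is a strobe on the te_tfu_wradvline signal, whereupon bytes for the next line are stored.

The FIFO supplies dots to the HCU. Once HCUDotCount dots have been supplied, any remaining bits in the
current FIFO byte are ignored.

The behaviour of the interfaces between the TFU and the TE and HCU is detailed below.

[Figure 193 (diagram not reproduced): a 16-byte FIFO written with te_tfu_data (8 bits) at FifoWrPnt and
read one bit at a time at FifoRdPnt and RdBit.]

Figure 193. 16-byte FIFO in TFU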



// Concurrently Executed Code:

// TE always allowed to write when there's either (a) room or (b) no room and all
// bytes for that line have been received.
if ((FifoCntnts != FifoMax) OR (FifoCntnts == FifoMax and ByteToRx == 0)) then
    tfu_te_oktowrite = 1
else
    tfu_te_oktowrite = 0

// Data presented to HCU when there is (a) data in FIFO and (b) the HCU has not
// received all dots for a line
if (FifoCntnts != 0) AND (BitToTx != 0) then
    tfu_hcu_avail = 1
else
    tfu_hcu_avail = 0

// Output mux of FIFO data
tfu_hcu_tdata = Fifo[FifoRdPnt][RdBit]

// Sequentially Executed Code:

// Store a byte from the TE when it is valid, there is room, and the line is not complete
if (te_tfu_wdatavalid == 1) AND (FifoCntnts != FifoMax) AND (ByteToRx != 0) then
    Fifo[FifoWrPnt] = te_tfu_wdata
    FifoWrPnt ++
    FifoCntnts ++
    ByteToRx --

if (te_tfu_wradvline == 1) then
    ByteToRx = TEByteCount

if (hcu_tfu_advdot == 1 and FifoCntnts != 0) then {
    BitToTx --                      // one fewer dot left to send on this line
    if (RepFrac == 1) then
        RepFrac = XScale
        if (RdBit == 7) then
            RdBit = 0
            FifoRdPnt ++
            FifoCntnts --
        else
            RdBit ++
    else
        RepFrac --
    if (BitToTx == 1) then {        // last dot of the line: discard the rest of the byte
        RepFrac = XFracScale
        RdBit = 0
        FifoRdPnt ++
        FifoCntnts --
        BitToTx = HCUDotCount
    }
}

What is not detailed above is the fact that, since this is a circular buffer, both the FIFO read- and
write-pointers wrap around to zero after they reach the last FIFO entry. Also not detailed is the fact that
if there is a change of both the read and write-pointer in the same cycle, the FIFO contents counter remains
unchanged.
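The pseudocode can be exercised with a small behavioural model. The sketch below is not the RTL: the class and method names are hypothetical, dots are assumed to be read least significant bit first, and the end-of-line case is given priority over the normal read-pointer advance so that a byte is never popped twice for one dot.

```python
class TfuFifo:
    """Behavioural sketch of the 16-byte TFU FIFO with X replication."""

    DEPTH = 16

    def __init__(self, x_scale, x_frac_scale, te_byte_count, hcu_dot_count):
        self.x_scale = x_scale
        self.x_frac_scale = x_frac_scale
        self.te_byte_count = te_byte_count
        self.hcu_dot_count = hcu_dot_count
        self.fifo = [0] * self.DEPTH
        self.wr = self.rd = self.count = self.rd_bit = 0
        self.rep = x_frac_scale            # RepFrac: copies left of the current dot
        self.bytes_to_rx = te_byte_count   # ByteToRx
        self.dots_to_tx = hcu_dot_count    # BitToTx

    def ok_to_write(self):
        return self.count != self.DEPTH or self.bytes_to_rx == 0

    def write(self, byte, adv_line=False):
        """Present one valid byte from the TE, optionally with te_tfu_wradvline."""
        if self.count != self.DEPTH and self.bytes_to_rx != 0:
            self.fifo[self.wr] = byte & 0xFF
            self.wr = (self.wr + 1) % self.DEPTH   # circular buffer wrap
            self.count += 1
            self.bytes_to_rx -= 1
        if adv_line:
            self.bytes_to_rx = self.te_byte_count

    def avail(self):
        return self.count != 0 and self.dots_to_tx != 0

    def _pop_byte(self):
        self.rd_bit = 0
        self.rd = (self.rd + 1) % self.DEPTH
        self.count -= 1

    def read_dot(self):
        """Return the current dot and advance, as on an hcu_tfu_advdot strobe."""
        if not self.avail():
            return None
        dot = (self.fifo[self.rd] >> self.rd_bit) & 1   # ASSUMPTION: LSB first
        last_dot = self.dots_to_tx == 1
        self.dots_to_tx -= 1
        if last_dot:                        # end of line: drop the rest of the byte
            self.rep = self.x_frac_scale
            self._pop_byte()
            self.dots_to_tx = self.hcu_dot_count
        elif self.rep == 1:                 # finished replicating this dot
            self.rep = self.x_scale
            if self.rd_bit == 7:
                self._pop_byte()
            else:
                self.rd_bit += 1
        else:
            self.rep -= 1
        return dot


f = TfuFifo(x_scale=2, x_frac_scale=1, te_byte_count=1, hcu_dot_count=5)
f.write(0b00000101, adv_line=True)
print([f.read_dot() for _ in range(5)])
```

In the example the first dot is emitted once (XFracScale = 1), later dots twice (XScale = 2), and the byte is discarded after the fifth dot because HCUDotCount has been reached.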
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28 Haiftoner Compositor Unit (HCU) 

28.1 Overview 

The Halftoner Compositor Unit (HCU) produces dots for each nozzle in the destination printhead taking
account of the page dimensions (including margins). The spot data and tag data are received in bi-level
form while the pixel contone data received from the CFU must be dithered to a bi-level representation. The
resultant 6 bi-level planes for each dot position on the page are then remapped to 6 output planes and
output to the next stage in the printing pipeline, namely the dead nozzle compensator (DNC).



28.2 Data flow 



Figure 194 shows a simple dot data flow high level block diagram of the HCU. The HCU reads contone
data from the CFU, bi-level spot data from the SFU, and bi-level tag data from the TFU. Dither matrices
are read from the DRAM via the DIU. The calculated output dot (6 bits) is read by the DNC.



[Figure 194 (block diagram not reproduced): the Halftoner / Compositor Unit takes inputs from the contone
FIFO unit interface, the DRAM interface unit, the spot FIFO unit interface and the tag FIFO unit interface,
and drives the dead nozzle compensator.]

Figure 194. High level block diagram showing the HCU and its external interfaces

The HCU is given the page dimensions (including margins), and is only started once for the page. It does
not need to be programmed in between bands or restarted for each band. The HCU will stall appropriately
if its input buffers are starved. At the end of the page the HCU will continue to produce 0 for all dots as
long as data is requested by the units further down the pipeline (this allows later units to conveniently flush
pipelined data).

The HCU performs a linear processing of dots, calculating the 6-bit output of a dot in each cycle. The
mapping of 6 calculated bits to 6 output bits for each dot allows for such example mappings as compositing of
the spot0 layer over the appropriate contone layer (typically black), the merging of CMY into K (if K is
present in the printhead), the splitting of K into CMY dots if there is no K in the printhead, and the
generation of a fixative output bitstream.

28.3 DRAM STORAGE REQUIREMENTS



SoPEC allows for a number of different dither matrix configurations up to 256 bytes wide. The dither
matrix is stored in DRAM. Using either a single or double-buffer scheme, a line of the dither matrix must
be read in by the HCU over a SoPEC line time. SoPEC must produce 13824 dots per line for A4/Letter
printing, which takes 13824 cycles.

The following give the storage and bandwidth requirements for some of the possible configurations of the
dither matrix:

• 4 Kbyte DRAM storage required for one 64x64 (preferred) byte dither matrix

• 6.25 Kbyte DRAM storage required for one 80x80 byte dither matrix

• 16 Kbyte DRAM storage required for four 64x64 byte dither matrices

• 64 Kbyte DRAM storage required for one 256x256 byte dither matrix

Note that regardless of the width of the dither matrix, 256 bytes are always read from DRAM for each line.
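The storage figures in the list, and the per-line read cost implied by the note, follow directly from the dimensions. A minimal sketch of that arithmetic is given below; the function names are illustrative, and the 256-bit DRAM word width is taken from the DIU descriptions elsewhere in this document.

```python
def dither_storage_kbytes(width, height, matrices=1):
    """DRAM storage for byte-per-entry dither matrices, in Kbytes."""
    return width * height * matrices / 1024


def dram_words_per_line(bytes_per_line=256, dram_word_bits=256):
    """Number of 256-bit DRAM words fetched for each dither matrix line."""
    return bytes_per_line * 8 // dram_word_bits


print(dither_storage_kbytes(64, 64))       # one 64x64 matrix
print(dither_storage_kbytes(80, 80))       # one 80x80 matrix
print(dither_storage_kbytes(64, 64, 4))    # four 64x64 matrices
print(dither_storage_kbytes(256, 256))     # one 256x256 matrix
print(dram_words_per_line())               # DRAM words read per line
```

Since a line takes 13824 cycles and always needs 8 DRAM words, the dither matrix fetch is a small, fixed load regardless of the matrix width.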
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Implementation 

A block diagram of the HCU is given in Figure 195.

Figure 195. Block diagram of the HCU
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28.4.1 Definition of I/O 



Table 142. HCU port list and description

Port name | Pins | I/O | Description
Clocks and reset
pclk | 1 | In | System clock.
prst_n | 1 | In | System reset, synchronous active low.
PCU interface
pcu_hcu_sel | 1 | In | Block select from the PCU. When pcu_hcu_sel is high both pcu_adr and pcu_dataout are valid.
pcu_rwn | 1 | In | Common read/not-write signal from the PCU.
pcu_adr[7:2] | 6 | In | PCU address bus. Only 6 bits are required to decode the address space for this block.
pcu_dataout[31:0] | 32 | In | Shared write data bus from the PCU.
hcu_pcu_rdy | 1 | Out | Ready signal to the PCU. When hcu_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on hcu_pcu_data is valid.
hcu_pcu_data[31:0] | 32 | Out | Read data bus to the PCU.
DIU interface
hcu_diu_rreq | 1 | Out | HCU read request, active high. A read request must be accompanied by a valid read address.
diu_hcu_rack | 1 | In | Acknowledge from DIU, active high. Indicates that a read request has been accepted and the new read address can be placed on the address bus, hcu_diu_radr.
hcu_diu_radr[21:5] | 17 | Out | HCU read address. 17 bits wide (256-bit aligned word).
diu_hcu_rvalid | 1 | In | Read data valid, active high. Indicates that valid read data is now on the read data bus, diu_data.
diu_data[63:0] | 64 | In | Read data from DIU.
CFU interface
cfu_hcu_avail | 1 | In | Indicates valid data present on cfu_hcu_c[3-0]data lines.
cfu_hcu_c0data[7:0] | 8 | In | Pixel of data in contone plane 0.
cfu_hcu_c1data[7:0] | 8 | In | Pixel of data in contone plane 1.
cfu_hcu_c2data[7:0] | 8 | In | Pixel of data in contone plane 2.
cfu_hcu_c3data[7:0] | 8 | In | Pixel of data in contone plane 3.
hcu_cfu_advdot | 1 | Out | Informs the CFU that the HCU has captured the pixel data on cfu_hcu_c[3-0]data lines and the CFU can now place the next pixel on the data lines.
SFU interface
sfu_hcu_avail | 1 | In | Indicates valid data present on sfu_hcu_sdata.
sfu_hcu_sdata | 1 | In | Bi-level dot data.
hcu_sfu_advdot | 1 | Out | Informs the SFU that the HCU has captured the dot data on sfu_hcu_sdata and the SFU can now place the next dot on the data line.
TFU interface
tfu_hcu_avail | 1 | In | Indicates valid data present on tfu_hcu_tdata.
tfu_hcu_tdata | 1 | In | Tag dot data.
hcu_tfu_advdot | 1 | Out | Informs the TFU that the HCU has captured the dot data on tfu_hcu_tdata and the TFU can now place the next dot on the data line.
DNC interface
dnc_hcu_ready | 1 | In | Indicates that DNC is ready to accept data from the HCU.
hcu_dnc_avail | 1 | Out | Indicates valid data present on hcu_dnc_data.
hcu_dnc_data[5:0] | 6 | Out | Output bi-level dot data in 6 ink planes.

28.4.2 Configuration Registers

The configuration registers in the HCU are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for the description of the protocol and timing diagrams for reading and writing registers in the
HCU. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads
and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the
HCU. When reading a register that is less than 32 bits wide, zeros should be returned on the upper unused
bit(s) of hcu_pcu_data. The configuration registers of the HCU are listed in Table 143.
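The two rules in this paragraph (word-aligned decode and zero-filled upper bits) can be illustrated with a short sketch. The helper names are hypothetical; the 6-bit decode width matches the pcu_adr[7:2] port, and the example offsets and widths are taken from the HCU register table.

```python
def pcu_register_index(byte_addr):
    """Decode a byte address to a 32-bit register index using pcu_adr[7:2]."""
    return (byte_addr >> 2) & 0x3F      # the lower 2 address bits are not decoded


def read_value(stored, width):
    """Value returned on hcu_pcu_data for a register narrower than 32 bits."""
    return stored & ((1 << width) - 1)  # upper unused bits read back as zero


print(pcu_register_index(0x1C))         # MaxDot lives at byte offset 0x1C
print(hex(read_value(0x12345, 16)))     # a 16-bit register returns 16 bits
```

Byte offsets 0x1C through 0x1F all decode to the same register, which is why only 32-bit accesses are meaningful.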



Table 143. HCU Registers 













Control reglstei 










rs 


0x00 


Reset 


1 


0x1 


A write to this register causes a reset of the HCU. 


0x04 


Go 


1 


0x0 


Writing 1 to this register starts the HCU Writing 0 to 
this register halts the HCU. 
When Go is asserted all counters, flags ete. are 
cleared or given their Initlat value, but configuration 
registers keep their values. 

When Go Is deasserted the state-machines go to their 
idle states but all counters and oonfiguiation registers 
keep their values. 

The HCU should be started aftorthe CFU, SFU, TFU. 
and DNC. 

This register can be read to detennine if the HCU Is 
running 

(1 = running, 0 = stopped). 


Setup reglsterB 


(constant for during processing) 


0x10 


AvailMask 


4 


0x0 


Mask used to determine which of the dotgen units etc. 
are to be checked before a dot is generated t>y the 
HCU vrithin the specified margins for the specified 
color plane. If the specified dotgen unH Is stalled, then 
the HCU will also stall. 

See Table 144 tbr Ijit aIk>cation and deffnltfon. 


0x14 


TMMask 


4 


0x0 


Same as AvailMask, but used In the top niargin area 
before the appropriate target page is reached. 


0x18 


PageMarginY 


32 


0x0000^ 
0000 


The first tine considered to be off the page. 


0x1 C 


MaxDot 


16 


0x0000 


This is the maximum dot number - 1 present across a 
page. Ft>r example if a page contains 13824 dots, 
then MaxDof will be 13823. 
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m 






0x20 

- 


TopMargin 


32 


0x0000. 
0000 


The first line on a page to be considered within the 
target page (or contone and spot data. (0 » first 
pnnted line of page) 


0x24 


BotlomMargin 


32 


0x0000^ 
0000 


The first line in the target bottom margin for contone 
and spot data (i.a. first line after target page). 


0x28 


LeftMargin 


16 


0x0000 


The first dot on a line within the target page for con- 
tone and spot data. 


0x2C 


RightMargin 


16 


OxFFFF 


The first dot on a line within the target right margin for 
contone and spot data. 


0x30 


TagTopMargin 


32 


OxOOOO_ 
0000 


The first line on a page to l>e considered within the 
target page for tag data. (0 = first printed line of page) 


0x34 


TagBottomMaroln 


32 


0x0000. 
0000 


The first One in the target bottom margin for tag data 
(i.e. first One after target page). 


0x38 


TagLeftMaiQin 


16 


0x0000 


The first dot on a line within the target page for tag 
data. 


0x3C 


TagRIghtMarg]n 


16 


OxFFFF 


The first dot on a One within the target right nnargin for 
lag data. 


0x40 


DMReadEnable 


1 


0x0 


1 if a dither matrfx Is spectfied 
0 if a dither matrix is not specified. 


0x44 


StartDMAdr 


17 


0x0_ 
0000 


Points to the first 256-bit word of the first line of the 
either matrix in DRAM. 


0x48 


EndDMAdr 


17 


0x0_ 
0000 


Points to the last Z56-brt word of the last line of the 
dither matrix in DRAM. 


vX40 


Unelncrement 


5 


0x2 


The number of 256-blt words in ORAM from the start 
of one line of the <fither matrix and the start of the next 
ifrte. i.e. the value by which the DRAM address is 

incremented at ttifi stArt Af a lina so that If nnrnte tn fho 
Start of the next line of the dither matrix. 


0x50 


DMinitlndexCO 


8 


0X00 


Initial Index within 256-byte dither matrix line buffer for 
contone plane 0. If using dout>te-buffer scheme, only 
the 7 Isbs are used. 


0x54 


DMLwrfndexCO 


8 


0x00 


Lower Index within 256-t>yte dither matrfx line buffer 
for contone plane 0. If using dout)le-buffer scheme, 
only trie 7 lsl>s are used. 


0x58 


OMUprindexCO 


8 


0X3F 


Upper index within 256-byte cBther matrix line buffer 
for contone plane 0. After reading the data at this 
location the index wraps to DMLwrtndexCO. If using 
double-buffer scheme, only the 7 Isbs are used. 


0X5C 


DMInitlndexCI 


8 


0x00 


Initial index within 256-byte dither matrix line buffter for 
contone plane 1. If using double-buffer scheme, only 
the 7 Isbs are used. 


0x60 


DMLwrlndexCI 


8 


0x00 


Lower index within 256-t>yte dither matrix line iMiffer 
for contone plane 1 . If using double-buffer scheme, 
only the 7 Isbs are used. 


0x64 


OMUprindexCl 


6 


Ox3F 


Upper index within 256-byte dither matrix line buffer 
for contone plane 1 . After reading the data at this 
location the index wraps to DMLwrlndexCI. If using 
douWe-tniffer scheme, only the 7 Isbs are used. 
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US 




0x68 


0MlnitlndexC2 


8 


0x00 


Initial index within 256-byte dither matrix line buffer for 
contone plane 2. If using double-buffer scheme, only 

the 7 Isbs are used. 


0x6C 


OMLwrlndexCa 


8 


0x00 


Lower index within 256-byte dither matrix fine buffer 
for cxsntone piane 2. If using double-buffer scheme, 
onlv the 7 Isbs art) u<ied 


0x70 


DMUprlndexC2 


8 


Ox3F 


Upper index wfthtn 2 56- byte dither matrix line buffer 
for contone plane 2. After reading the data at this 

llVMlttnn tHo inHAv u/ranc fl/Lfl Li/rf/l^(oy/^9 If i re»n/^ 
IVmLUUII UlV If IUbX WidfJo tU lyin^WflflUtiXK^^. II USillQ 

double-buffer scheme, only the 7 Isbs are used. 


0x74 


0MlnmndexC3 


8 


0x00 


Initial index within 256-tiyte dither matrix line buffer for 
contone plane 3. If using doulole*buffer s^eme, only 
the 7 Isbs are used. 


0x78 . 


DMLwrlndexC3 


8 


0x00 


Lower index within 256-byte dither n^trix line buffer 
for contone plane 3. If using double-buffer scheme, 
only the 7 Isbs are used. 


0x7C 


DMUprtndexCa 


6 


0x3F 


Upper index within 256-byte dither matrix line buffer 
for contone plane 3. After reading the data at thfe 
location the index wraps to 0MLwrtndexC3, If using 
double-buffer scheme, only the 7 Isbs are used. 


0x80 


DoubleUneBuf 


1 


0x1 


Selects the drther line Ixjffer mode to be single or dou- 
ble buffer. 


0x84 to 0x98 


IOMappingLo 


6x32 


0x0000_0000 


The dot reorg mapping for output inks 0 to 5. For each 
ink's 64-bit IOMapping value, IOMappingLo represents 
the low order 32 bits. 


0x9C to 0xB0 


IOMappingHi 


6x32 


0x0000_0000 


The dot reorg mapping for output inks 0 to 5. For each 
ink's 64-bit IOMapping value, IOMappingHi represents 
the high order 32 bits. 


0xB4 to 0xC0 


cpConstant 


4x8 


0x00 


The constant contone value to output for contone 
plane N when printing in the margin areas of the page. 
This value will typically be 0. 


0xC4 


sConstant 


1 


0x0 


The constant bi-level value to output for spot when 
printing in the margin areas of the page. This value 
will typically be 0. 


0xC8 


tConstant 


1 


0x0 


The constant bi-level value to output for tag data when 
printing in the margin areas of the page. This value 
will typically be 0. 


0xCC 


DitherConstant 


8 


0xFF 


The constant value to use for dither matrix when the 
dither matrix is not available, i.e. when the signal 
dm_avail is 0. This value will typically be 0xFF so that 
cpConstant can easily be 0x00 or 0xFF without requiring 
a dither matrix (DitherConstant is primarily used 
for threshold dithering in the margin areas). 


Debug registers (read only) 




0xD0 


HcuPortsDebug 


14 


N/A 


Bit 13 = tfu_hcu_avail 
Bit 12 = hcu_tfu_advdot 
Bit 11 = sfu_hcu_avail 
Bit 10 = hcu_sfu_advdot 
Bit 9 = cfu_hcu_avail 
Bit 8 = hcu_cfu_advdot 
Bit 7 = dnc_hcu_ready 
Bit 6 = hcu_dnc_avail 
Bits 5-0 = hcu_dnc_data 


0xD4 


HcuDotgenDebug 


15 


N/A 


Bit 14 = after_top_margin 
Bit 13 = in_tag_target_page 
Bit 12 = in_target_page 
Bit 11 = tp_avail 
Bit 10 = s_avail 
Bit 9 = cp_avail 
Bit 8 = dm_avail 
Bit 7 = advdot 
Bits 5-0 = [tp, s, cp3, cp2, cp1, cp0] 
(i.e. 6 bit input to dot reorg units) 


0xD8 


HcuDitherDebug1 


17 


N/A 


Bit 17 = advdot 
Bit 16 = dm_avail 
Bits 15-8 = cp1_dither_val 
Bits 7-0 = cp0_dither_val 


OxDC 


HcuDitherDebug2 


17 


N/A 


Bit 17 = advdot 
Bit 16 = dm_avail 
Bits 15-8 = cp3_dither_val 
Bits 7-0 = cp2_dither_val 



28.4.3 Control unit 

The control unit is responsible for controlling the overall flow of the HCU. It is responsible for determining 
whether or not a dot will be generated in a given cycle, and what dot will actually be generated, 
including whether or not the dot is in a margin area, and what dither cell values should be used at the 
specific dot location. A block diagram of the control unit is shown in Figure 196. 

The inputs to the control unit are a number of avail flags specifying whether or not a given dotgen unit is 
capable of supplying 'real' data in this cycle. The term 'real' refers to data generated from external 
sources, such as contone line buffers, bi-level line buffers, and tag plane buffers. Each dotgen unit informs 
the control unit whether or not a dot can be generated this cycle from real data. The control unit must also 
check that the DNC is ready to receive data. 

The contone/spot margin unit is responsible for determining whether the current dot coordinate is within 
the target contone/spot margins, and the tag margin unit is responsible for determining whether the current 
dot coordinate is within the target tag margins. 

The dither matrix table interface provides the interface to DRAM for the generation of dither cell values 
that are used in the halftoning process in the contone dotgen unit. 
Figure 196. Block diagram of the control unit 
(Blocks: determine advdot, position unit, contone/spot margin unit, tag margin unit, dither matrix table 
interface. Labelled signals include cp_avail, s_avail, tp_avail, avail_mask, tm_mask, in_target_page, 
in_tag_target_page, in_page, ok_to_read, ok_to_write, advdot, rd_advdot, wr_advdot, max_dot, 
hcu_dnc_avail, dnc_hcu_ready, hcu_diu_radr, diu_hcu_rvalid and diu_data.) 



28.4.3.1 Determine AdvDot 



The HCU does not always require contone planes, bi-level or tag planes in order to produce a page. For 
example, a given page may not have a bi-level layer, or a tag layer. In addition, the contone and bi-level 
parts of a page are only required within the contone and bi-level page margins, and the tag part of a page is 
only required within the tag page margins. Thus output dots can be generated without contone, bi-level or 
tag data before the respective top margins of a page have been reached, and 0s are generated for all color 
planes after the end of the page has been reached (to allow later stages of the printing pipeline to flush). 

Consequently the HCU has an AvailMask register that determines which of the various input avail flags 
should be taken notice of during the production of a page from the first line of the target page, and a 
TMMask register that has the same behaviour, but is used in the lines before the target page has been 
reached (i.e. inside the target top margin area). Each bit in the AvailMask refers to a particular avail bit: if 
the bit in the AvailMask register is set, then the corresponding avail bit must be 1 for the HCU to advance 
a dot. The bit to avail correspondence is shown in Table 144. Care should be taken with TMMask: if the 
particular data is not available after the top margin has been reached, then the HCU will stall. Note that the 
avail bits for contone and spot colors are ANDed with in_target_page after the target page area has been 
reached to allow dot production in the contone/spot margin areas without needing any data in the CFU and 
SFU. The avail bit for tag color is ANDed with in_tag_target_page after the target tag page area has been 
reached to allow dot production in the tag margin areas without needing any data in the TFU. 



Table 144. Correspondence between bit in AvailMask and avail flag 

Bit     Avail flag    Description 
0       dm_avail      dither matrix data available 
1       cp_avail      contone pixels available 
2       s_avail       spot color available 
3       tp_avail      tag plane available 



Each of the input avail bits is processed with its appropriate mask bit and the after_top_margin flag. The 
output bits are ANDed together along with Go and ok_to_write (which specifies whether the output buffer 
is ready to receive a dot in this cycle) to form the output bit advdot. We also generate wr_advdot. In this 
way, if the output buffer is full or any of the specified avail flags is clear, the HCU will stall. When the end 
of the page is reached, in_page will be deasserted and the HCU will continue to produce 0 for all dots as 
long as the DNC requests data. A block diagram of the determine advdot unit is shown in Figure 197. 

The ok_to_read signal from the output buffer indicates that the HCU has a dot available for the DNC to 
read (indicated to the DNC by the assertion of hcu_dnc_avail). If the DNC is ready to receive the dot 
(dnc_hcu_ready is 1) then the dot is read from the output buffer by asserting rd_advdot. 
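
The masking described above can be sketched in Python. This is an illustrative model only, not the actual gate-level logic of Figure 197: the function names are hypothetical, and the sketch assumes that once the top margin has been passed a flag is only demanded inside its target page area, which is one reading of "ANDed with in_target_page". The dither matrix flag is passed with its default in_target_page=True, i.e. it is left ungated.

```python
def flag_ok(avail, avail_mask_bit, tm_mask_bit, after_top_margin, in_target_page=True):
    """Return True if this avail flag does not block dot generation.

    Assumption: TMMask applies before the top margin has been passed,
    AvailMask afterwards, and in the margin areas (in_target_page == 0)
    the data is not demanded at all.
    """
    if not after_top_margin:
        required = tm_mask_bit
    else:
        required = avail_mask_bit and in_target_page
    return (not required) or bool(avail)


def determine_advdot(flags_ok, go, ok_to_write):
    """advdot is the AND of Go, ok_to_write and every processed avail bit."""
    return bool(go and ok_to_write and all(flags_ok))


def determine_rd_advdot(ok_to_read, dnc_hcu_ready):
    """A dot is read out only when one is buffered and the DNC is ready."""
    return bool(ok_to_read and dnc_hcu_ready)
```

With a mask bit set and its avail flag clear, flag_ok returns False inside the target page (the HCU stalls) but True in a margin area, which matches the intent of generating margin dots without CFU, SFU or TFU data.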
Figure 197. Block diagram of determine advdot unit 
(Inputs: in_target_page, in_tag_target_page, after_top_margin, after_tag_top_margin, tm_mask[0-3], 
avail_mask[0-3], dm_avail, cp_avail, s_avail, tp_avail, in_page, Go, ok_to_read, dnc_hcu_ready. 
Outputs: advdot, wr_advdot, rd_advdot, hcu_dnc_avail.) 



28.4.3.2 Position unit 



The position unit is responsible for outputting the position of the current dot (curr_pos, curr_line) and 
whether or not this dot is the last dot of a line (advline). Both curr_pos and curr_line are set to 0 at reset or 
when Go transitions from 0 to 1. The position unit relies on the advdot input signal to advance through the 
dots on a page. Whenever an advdot pulse is received, curr_pos gets incremented. If curr_pos equals 
max_dot then an advline pulse is generated as this is the last dot in a line, curr_line gets incremented, and 
curr_pos is reset to 0 to start counting the dots for the next line. 
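
A minimal Python sketch of this counter behaviour follows. The class name and method names are hypothetical; the sketch only models the counting rules stated above and says nothing about cycle timing.

```python
class PositionUnit:
    """Tracks curr_pos and curr_line and reports advline pulses."""

    def __init__(self, max_dot):
        self.max_dot = max_dot  # index of the last dot in a line
        self.reset()

    def reset(self):
        # Applied at reset or when Go transitions from 0 to 1.
        self.curr_pos = 0
        self.curr_line = 0

    def step(self, advdot):
        """Advance by one cycle; return True when an advline pulse occurs."""
        if not advdot:
            return False
        if self.curr_pos == self.max_dot:
            # Last dot of the line: wrap the dot counter, move to the next line.
            self.curr_pos = 0
            self.curr_line += 1
            return True
        self.curr_pos += 1
        return False
```

With max_dot = 3, the fourth advdot pulse of each line produces advline and returns curr_pos to 0.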



28.4.3.3 Margin unit 



The responsibility of the margin unit is to determine whether the specific dot coordinate is within the page 
at all, within the target page or in a margin area (see Figure 198). This unit is instantiated for both the 
contone/spot margin unit and the tag margin unit. 
Figure 198. Page structure 
(Labelled regions: target top margin, target bottom margin, target page, printable page area (physical page).) 

The margin unit takes the current dot and line position, and returns three flags. 

• the first, in_page, is 1 if the current dot is within the page, and 0 if it is outside the page. 

• the second flag, in_target_page, is 1 if the dot coordinate is within the target page area of the page, and 
0 if it is within the target top/left/bottom/right margins. 

• the third flag, after_top_margin, is 1 if the current dot is below the target top margin, and 0 if it is 
within the target top margin. 

A block diagram of the margin unit is shown in Figure 199. 



Figure 199. Block diagram of margin unit 
(Inputs: curr_line, curr_pos, top_margin, bottom_margin, page_margin_y, right_margin, left_margin. 
Outputs: in_page, in_target_page, after_top_margin.) 
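
The three flags can be modelled with a few comparisons. The sketch below is an assumption-laden illustration: the parameter names are taken from the labels in Figure 199, but whether each boundary is inclusive or exclusive is not stated in the text, so the comparisons chosen here (all inclusive) are hypothetical.

```python
def margin_flags(curr_pos, curr_line, top_margin, bottom_margin,
                 left_margin, right_margin, page_margin_y):
    """Return (in_page, in_target_page, after_top_margin).

    Assumptions (not confirmed by the text): the target page covers lines
    top_margin..bottom_margin and dots left_margin..right_margin, all
    inclusive, and the page ends after line page_margin_y.
    """
    in_page = curr_line <= page_margin_y
    after_top_margin = curr_line >= top_margin
    in_target_page = (top_margin <= curr_line <= bottom_margin
                      and left_margin <= curr_pos <= right_margin)
    return in_page, in_target_page, after_top_margin
```

The same function would be evaluated twice, once with the contone/spot margin settings and once with the tag margin settings, mirroring the two instances of the unit.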
28.4.3.4 Dither matrix table interface 

The dither matrix table interface provides the interface to DRAM for the generation of dither cell values 
that are used in the halftoning process in the contone dotgen unit. The control flag dm_read_enable 
enables the reading of the dither matrix table line structure from DRAM. If dm_read_enable is 0, the 
dither matrix is not specified in DRAM and no DRAM accesses are attempted. The dither matrix table 
interface has an output flag dm_avail which specifies if the current line of the specified matrix is available. 
The HCU can be directed to stall when dm_avail is 0 by setting the appropriate bit in the HCU's AvailMask 
or TMMask registers. When dm_avail is 0 the value in the DitherConstant register is used as the 
dither cell values that are output to the contone dotgen unit. 

The dither matrix table interface consists of a state machine that interfaces to the DRAM interface, a dither 
matrix buffer that provides dither matrix values, and a unit to generate the addresses for reading the buffer. 
Figure 200 shows a block diagram of the dither matrix table interface. 



Figure 200. Block diagram of dither matrix table interface 
(Inputs: advline, advdot, dm_init_index_c[0-3], dm_lwr_index_c[0-3], DoubleLineBuf, start_dm_adr, 
end_dm_adr, line_increment, dm_read_enable, dither_constant. Outputs: cp0_dither_val, cp1_dither_val, 
cp2_dither_val, cp3_dither_val.) 

28.4.3.4.1 Dither matrix buffer 

The state machine loads dither matrix table data a line at a time from DRAM and stores it in a buffer. A 
single line of the dither matrix is either 256 or 128 8-bit entries, depending on the programmable bit 
DoubleLineBuf. If this bit is enabled, a double-buffer mechanism is employed such that while one buffer is 
read from for the current line's dither matrix data (8 bits representing a single dither matrix entry), the 
other buffer is being written to with the next line's dither matrix data (64 bits at a time). Alternatively, the 
single buffer scheme can be used, where the data must be loaded at the end of the line, thus incurring a 
delay. 

The single/double buffer is implemented using a 256 byte 3-port register array, two reads, one write port, 
with the reads clocked at double the system clock rate (320MHz) allowing 4 reads per clock cycle. 

The dither matrix buffer unit also provides the mechanism for keeping track of the current read and write 
buffers, and providing the mechanism such that a buffer cannot be read from until it has been written to. In 
this case, each buffer is a line of the dither matrix, i.e. 256 or 128 bytes. 

A bit is kept for the status of each dither matrix line buffer: buff_avail[0] and buff_avail[1]. It also keeps a 
single bit (rd_buff) for the current buffer that reads are to occur from, and a single bit (wr_buff) for the 
current buffer that writes are to occur to. The output value dm_avail equals buff_avail[rd_buff]. The output 
value ok_to_write equals ~buff_avail[wr_buff]. Note that when using a single line buffer, buff_avail[1] is 
not used. 

The read addresses are byte aligned. A single dither matrix entry is represented by 8 bits and an entry is 
read for each of the four contone planes in parallel. When an advline pulse is received, buff_avail[rd_buff] 
is cleared and rd_buff is inverted (if using a double line buffer). 

Data is written 64 bits at a time to the current write buffer when diu_hcu_rvalid is asserted. When WrAdr 
is 0x1F and diu_hcu_rvalid is 1, buff_avail[wr_buff] is set, and wr_buff is inverted (if using a double line 
buffer). This indicates that a line of dither matrix has been written to the current write buffer and it is now 
available to be read. 

28.4.3.4.2 Read address generator 

For each contone plane there is an initial, lower and upper index to be used when reading dither cell values 
from the dither matrix double buffer. The read address for each plane is used to select a byte from the 
current 256-byte read buffer. When Go gets set (0 to 1 transition), or at the end of a line, the read addresses 
are set to their corresponding initial index. Otherwise, the read address generator relies on advdot to 
advance the addresses within the inclusive range specified by the lower and upper indices, represented by the 
following pseudocode: 

if (advdot == 1) then 
    if (advline == 1) then 
        rd_adr = dm_init_index 
    elsif (rd_adr == dm_upr_index) then 
        rd_adr = dm_lwr_index 
    else 
        rd_adr ++ 
else 
    rd_adr = rd_adr 
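
The same update rule can be written as a runnable Python function, shown here as a sketch for a single contone plane. The function name is hypothetical; the parameters correspond to the initial, lower and upper index registers for that plane.

```python
def next_rd_adr(rd_adr, advdot, advline, init_index, lwr_index, upr_index):
    """Return the read address for the next cycle for one contone plane."""
    if not advdot:
        return rd_adr          # hold the address when no dot is advanced
    if advline:
        return init_index      # end of line: restart from the initial index
    if rd_adr == upr_index:
        return lwr_index       # wrap from the upper to the lower index
    return rd_adr + 1


# Example: walk a 4-entry range (indices 0..3) starting at initial index 2.
adr = 2
trace = []
for _ in range(6):
    adr = next_rd_adr(adr, advdot=1, advline=0,
                      init_index=2, lwr_index=0, upr_index=3)
    trace.append(adr)
```

Starting from address 2, the trace visits 3, 0, 1, 2, 3, 0, showing the wrap from the upper index back to the lower index.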

28.4.3.4.3 State machine 

The dither matrix is read from DRAM in single 256-bit accesses, receiving the data from the DIU over 4 
clock cycles (64 bits per cycle). The protocol and timing for read accesses to DRAM is described in 
section 20.9.1 on page 208. Read accesses to DRAM are implemented by means of the state machine 
described in Figure 201. 

All counters and flags should be cleared after reset or when Go transitions from 0 to 1. While the Go bit is 
1, the state machine relies on the dm_read_enable bit to tell it whether to attempt to read dither matrix data 
from DRAM. When dm_read_enable is clear, the state machine does nothing and remains in the idle state. 
When dm_read_enable is set, the state machine continues to load dither matrix data, 256 bits at a time 
(received over 4 clock cycles, 64 bits per cycle), while there is space available in the dither matrix buffer. 

The read address and line_start_adr are initially set to start_dm_adr. The read address gets incremented 
after each read access. It takes 8 read accesses to load a 256-byte line of dither matrix into the dither matrix 
buffer when using a single buffer, or 4 read accesses to load a 128-byte line when using a double buffer. 
A count is kept of the accesses to DRAM. When a read access completes and access_count equals 3 or 7 
(double or single buffer respectively), a line of dither matrix has just been loaded from DRAM and the read 
address is updated to line_start_adr plus line_increment so it points to the start of the next line of dither 
matrix (line_start_adr is also updated to this value). If the read address equals end_dm_adr then the next 
read address will be start_dm_adr, thus the read address wraps to point to the start of the area in DRAM 
where the dither matrix is stored. 

The write address for the dither matrix buffer is implemented by means of a modulo-32 counter that is 
initially set to 0 and incremented when diu_hcu_rvalid is asserted. 
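
The DRAM read address sequence can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: addresses are counted in 256-bit DRAM words, the buffer never fills (so no stalls are modelled), and the function name is hypothetical.

```python
def dither_read_addresses(start_dm_adr, end_dm_adr, line_increment,
                          double_buffer, n_accesses):
    """List the first n_accesses DRAM read addresses (in 256-bit words)."""
    # A 128-byte line needs 4 accesses of 32 bytes, a 256-byte line needs 8.
    last_count = 3 if double_buffer else 7
    radr = line_start_adr = start_dm_adr
    access_count = 0
    addresses = []
    for _ in range(n_accesses):
        addresses.append(radr)
        if access_count == last_count:
            access_count = 0
            if radr == end_dm_adr:
                # End of the stored matrix: wrap to its first line.
                radr = line_start_adr = start_dm_adr
            else:
                line_start_adr += line_increment
                radr = line_start_adr
        else:
            radr += 1
            access_count += 1
    return addresses
```

For example, dither_read_addresses(100, 111, 8, True, 9) returns [100, 101, 102, 103, 108, 109, 110, 111, 100]: two 4-access lines spaced line_increment apart, followed by the wrap back to start_dm_adr.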
Figure 201. State machine to read dither matrix table 
(States: idle, req, ack, read1, read2, read3, read4. Signals and counters updated on the transitions: 
hcu_diu_rreq, hcu_diu_radr, line_start_adr, access_count, wr_adr. Transition conditions involve 
dm_read_enable, ok_to_write, diu_hcu_rvalid, access_count equal to 3/7, and hcu_diu_radr compared 
with end_dm_adr.) 
28.4.4 Contone dotgen unit 



The contone dotgen unit is responsible for producing a dot in up to 4 color planes per cycle. The contone 
dotgen unit also produces a cp_avail flag which specifies whether or not contone pixels are currently 
available, and the output hcu_cfu_advdot to request the CFU to provide the next contone pixel in up to 4 color 
planes. 

The block diagram for the contone dotgen unit is shown in Figure 202. 



Figure 202. Contone dotgen unit 
(Signals: hcu_cfu_advdot, cfu_hcu_c0data, cfu_hcu_c1data, cfu_hcu_c2data, cfu_hcu_c3data, 
cfu_hcu_avail, advdot, cp[0-3]_constant. The unit contains one dither unit per contone plane.) 

A dither unit provides the functionality for dithering a single contone plane. The contone image is only 
defined within the contone/spot margin area. As a result, if the input flag in_target_page is 0, then a 
constant contone pixel value is used for the pixel instead of the contone plane. 

The resultant contone pixel is then halftoned. The dither value to be used in the halftoning process is 
provided by the control data unit. The halftoning process involves a comparison between a pixel value and its 
corresponding dither value. If the 8-bit contone value is greater than or equal to the 8-bit dither matrix 
value a 1 is output. If not, then a 0 is output. This means each entry in the dither matrix is in the range 
1-255 (0 is not used). 
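
The comparison is simple enough to state directly in Python. The following sketch uses hypothetical function names and models one contone plane for one dot.

```python
def halftone(contone_value, dither_value):
    """Threshold an 8-bit contone value against an 8-bit dither cell value."""
    return 1 if contone_value >= dither_value else 0


def contone_dot(in_target_page, cfu_pixel, cp_constant, dither_value):
    """Select the pixel source, then halftone it.

    Outside the target page the constant contone value replaces the
    pixel from the contone plane.
    """
    pixel = cfu_pixel if in_target_page else cp_constant
    return halftone(pixel, dither_value)
```

Because the test is "greater than or equal", a dither entry of 0 would produce a dot even for a contone value of 0, which is why 0 is excluded from the dither matrix. It also shows why a DitherConstant of 0xFF is convenient in the margins: a constant pixel of 0x00 always yields 0 and a constant of 0xFF always yields 1.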
28.4.5 Spot dotgen unit 

The spot dotgen unit is responsible for producing a dot of bi-level data per cycle. It deals with bi-level data 
(and therefore does not need to halftone) that comes from the LBD via the SFU. Like the contone layer, 
the bi-level spot layer is only defined within the contone/spot margin area. As a result, if input flag 
in_target_page is 0, then a constant dot value (typically this would be 0) is used for the output dot. 

The spot dotgen unit also produces a s_avail flag which specifies whether or not spot dots are currently 
available for this spot plane, and the output hcu_sfu_advdot to request the SFU to provide the next bi-level 
data value. The spot dotgen unit can be represented by the following pseudocode: 

s_avail = sfu_hcu_avail 

if (in_target_page == 1 AND advdot == 1) then 
    hcu_sfu_advdot = 1 
else 
    hcu_sfu_advdot = 0 

if (in_target_page == 1) then 
    sp = sfu_hcu_sdata 
else 
    sp = sp_constant 

28.4.6 Tag dotgen unit 

This unit is very similar to the spot dotgen unit (see Section 28.4.5) in that it deals with bi-level data, in 
this case from the TE via the TFU. The tag layer is only defined within the tag margin area. As a result, if 
input flag in_tag_target_page is 0, then a constant dot value, tp_constant (typically this would be 0), is 
used for the output dot. The tagplane dotgen unit also produces a tp_avail flag which specifies whether or 
not tag dots are currently available for the tagplane, and the output hcu_tfu_advdot to request the TFU to 
provide the next bi-level data value. 

28.4.7 Dot reorg unit 

The dot reorg unit provides a means of mapping the bi-level dithered data, the spot0 color, and the tag data 
to output inks in the actual printhead. Each dot reorg unit takes a set of 6 1-bit inputs and produces a single 
bit output that represents the output dot for that color plane. 

The output bit is a logical combination of any or all of the input bits. This allows the spot color to be 
placed in any output color plane (including infrared for testing purposes), black to be merged into cyan, 
magenta and yellow (in the case of no black ink in the Memjet printhead), and tag dot data to be placed in 
a visible plane. An output for fixative can readily be generated by simply combining desired input bits. 

The dot reorg unit contains a 64-bit lookup to allow complete freedom with regards to mapping. Since all 
possible combinations of input bits are accounted for in the 64-bit lookup, a given dot reorg unit can take 
the mapping of other reorg units into account. For example, a black plane reorg unit may produce a 1 only 
if the contone plane 3 or spot color inputs are set (this effectively composites black bi-level over the 
contone). A fixative reorg unit may generate a 1 if any 2 of the output color planes is set (taking into account 
the mappings produced by the other reorg units). 

If dead nozzle replacement is to be used (see section 29.4.2 on page 448), the dot reorg can be 
programmed to direct the dots of the specified color into the main plane, and 0 into the other. If a nozzle is 
then marked as dead in the DNC, swapping the bits between the planes will result in 0 in the dead nozzle, 
and the required data in the other plane. 

If dead nozzle replacement is to be used, and there are no tags, the TE can be programmed with the 
position of dead nozzles and the resultant pattern used to direct dots into the specified nozzle row. If only fixed 
background TFS is to be used, a limited number of nozzles can be replaced. If variable tag data is to be 
used to specify dead nozzles, then large numbers of dead nozzles can be readily compensated for. 

The dot reorg unit can be used to average out the nozzle usage when two rows of nozzles share the same 
ink and tag encoding is not being used. The TE can be programmed to produce a regular pattern (e.g. 0101 
on one line, and 1010 on the next) and this pattern can be used as a directive to direct dots into the 
specified nozzle row. 

Each reorg unit contains a 64-bit IOMapping value programmable as two 32-bit HCU registers, and a set 
of selection logic based on the 6-bit dot input (2^6 = 64 bits), as shown in Figure 203. 

Figure 203. Block diagram of dot reorg unit 
(The 6-bit input dot selects one bit of the 64-bit IOMapping value as the output dot.) 

The mapping of input bits to each of the 6 selection bits is as defined in Table 145. 

Table 145. Mapping of input bits to 6 selection bits 

Selection bit    Tied to                              Likely interpretation 
0                bi-level dot from contone layer 0    cyan 
1                bi-level dot from contone layer 1    magenta 
2                bi-level dot from contone layer 2    yellow 
3                bi-level dot from contone layer 3    black 
4                bi-level spot0 dot                   black 
5                bi-level tag dot                     infra-red 
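
The 64-bit lookup can be sketched in Python as follows. This is an illustrative model only: the helper names are hypothetical, and it assumes that bit i of the 64-bit IOMapping value holds the output for the 6-bit input pattern i, with the input bits numbered as in Table 145.

```python
def build_io_mapping(fn):
    """Build a 64-bit IOMapping value from a boolean function of the inputs.

    fn receives (cp0, cp1, cp2, cp3, spot, tag), each 0 or 1.
    Assumption: lookup bit i holds the output for input pattern i.
    """
    mapping = 0
    for pattern in range(64):
        inputs = [(pattern >> bit) & 1 for bit in range(6)]
        if fn(*inputs):
            mapping |= 1 << pattern
    return mapping


def split_io_mapping(mapping):
    """Split into (IOMappingLo, IOMappingHi), the low and high 32 bits."""
    return mapping & 0xFFFFFFFF, mapping >> 32


def reorg_dot(mapping, pattern):
    """Select the output dot for a 6-bit input pattern."""
    return (mapping >> (pattern & 0x3F)) & 1


# Example: a black plane that fires if contone plane 3 or the spot input is set.
black = build_io_mapping(lambda cp0, cp1, cp2, cp3, spot, tag: cp3 or spot)
```

Under these assumptions, reorg_dot(black, 0b001000) and reorg_dot(black, 0b010000) both give 1, while an input with only cyan, magenta, yellow or tag bits set gives 0.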
29 Dead Nozzle Compensator (DNC) 

29.1 Overview 

The Dead Nozzle Compensator (DNC) is responsible for adjusting Memjet dot data to take account of 
non-functioning nozzles in the Memjet printhead. Input dot data is supplied from the HCU, and the 
corrected dot data is passed out to the DWU. The high level data path is shown by the block diagram in Figure 
204. 



Figure 204. High level block diagram of DNC 
(Raw dot data flows from the HCU into the DNC, which reads dead nozzle data from DRAM and passes 
compensated dot data to the DWU.) 



The DNC compensates for dead nozzles by performing the following operations: 

• Dead nozzle removal, i.e. turn the nozzle off 

• Ink replacement by direct substitution, i.e. K -> K_alternative 

• Ink replacement by indirect substitution, i.e. K -> CMY 

• Error diffusion to adjacent nozzles 

• Fixative corrections 

The DNC is required to efficiently support up to 5% dead nozzles, under the expected DRAM bandwidth 
allocation, with no restriction on where dead nozzles are located, and to handle any fixative correction due to 
nozzle compensations. Performance must degrade gracefully after 5% dead nozzles. 

29.2 Dead nozzle identification 

Dead nozzles are identified by means of a position value and a mask value. Position information is 
represented by a 10-bit delta encoded format, where the 10-bit value defines the number of dots between dead 
nozzle columns (see footnote 1). Along with the delta information the DNC also reads the 6-bit dead nozzle 
mask (dn_mask) for the defined dead nozzle position. Each bit in the dn_mask corresponds to an ink plane. 
A set bit indicates that the nozzle for the corresponding ink plane is dead. The dead nozzle table format is 
shown in Figure 205. The DNC reads dead nozzle information from DRAM in single 256-bit accesses. A 
10-bit delta encoding scheme is chosen so that each table entry is 16 bits wide, and 16 entries fit exactly in 
each 256-bit read. Using 10-bit delta encoding means that the maximum distance between dead nozzle 
columns is 1023 dots. It is possible that dead nozzles may be spaced further than 1023 dots from each other, 
so a null dead nozzle identifier is required. A null dead nozzle identifier is defined as a 6-bit dn_mask of all 
zeros. These null dead nozzle identifiers should also be used so that: 

• the dead nozzle table is a multiple of 16 entries (so that it is aligned to the 256-bit DRAM locations) 

• the dead nozzle table spans the complete length of the line, i.e. the first entry in the dead nozzle table should 
have a delta from the first nozzle column in a line and the last entry in the dead nozzle table should 
correspond to the last nozzle column in a line. 

1. For a 10-bit delta value of d, if the current column n is a dead nozzle column then the next dead nozzle 
column is given by n + (d + 1). 

Note that the DNC deals with the width of a page. This may or may not be the same as the width of the 
printhead (the PHI may introduce some margining to the page so that its dot output matches the width of 
the printhead). Care must be taken when programming the dead nozzle table so that dead nozzle positions 
are correctly specified with respect to the page and printhead. 



Figure 205. Dead nozzle table format 
(The table is 16 bits wide with one entry per dead nozzle column, N entries in total. Table entry structure: 
bits 15-6 hold the 10-bit delta encode, bits 5-0 hold the 6-bit DnMask.) 
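
The 16-bit entry layout can be sketched in Python as below. The helper names are hypothetical; only the bit positions (delta in bits 15-6, mask in bits 5-0) and the definition of a null identifier come from the text.

```python
ENTRIES_PER_READ = 256 // 16  # 16 entries fit in one 256-bit DRAM access


def pack_entry(delta, dn_mask):
    """Pack a 10-bit delta and a 6-bit dead nozzle mask into a 16-bit entry."""
    if not 0 <= delta <= 1023:
        raise ValueError("delta must fit in 10 bits")
    if not 0 <= dn_mask <= 0x3F:
        raise ValueError("dn_mask must fit in 6 bits")
    return (delta << 6) | dn_mask


def unpack_entry(entry):
    """Return (delta, dn_mask) from a 16-bit table entry."""
    return (entry >> 6) & 0x3FF, entry & 0x3F


def is_null_entry(entry):
    """A null dead nozzle identifier has a dn_mask of all zeros."""
    return (entry & 0x3F) == 0
```

For instance, pack_entry(1023, 0b000101) gives 0xFFC5, and an entry such as pack_entry(1023, 0) is a null identifier that only advances the column position.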



29.3 DRAM storage and bandwidth requirement 

The memory required is largely a factor of the number of dead nozzles present in the printhead (which in 
turn is a factor of the printhead size). The DNC is required to read a 16-bit entry from the dead nozzle table 
for every dead nozzle. Table 146 shows the DRAM storage and average bandwidth requirements (see the 
footnote to Table 146) for the DNC for different percentages of dead nozzles and different page sizes. 

Table 146. Dead Nozzle storage and average bandwidth requirements 

Page size    % Dead Nozzles    Memory (KBytes)    Bandwidth (bits/cycle) 
A4 (a)       5%                1.4 (c)            0.8 (d) 
             10%               2.7                1.6 
             15%               4.1                2.4 
A3 (b)       5%                1.9                0.8 
             10%               3.8                1.6 
             15%               5.7                2.4 

a. Bi-lithic printhead has 13824 nozzles per color providing full bleed printing for A4/Letter 

b. Bi-lithic printhead has 19488 nozzles per color providing full bleed printing for A3 

c. 16 bits x 13824 nozzles x 0.05 dead 

d. (16 bits read / 20 cycles) = 0.8 bits/cycle 

1. Average bandwidth assumes an even spread of dead nozzles. Clumps of dead nozzles may cause delays due to 
insufficient available DRAM bandwidth. These delays will occur every line causing an accumulative delay over a page. 
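
The arithmetic behind notes c and d can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch assumes 1 KByte = 1024 bytes and one dot column processed per cycle.

```python
ENTRY_BITS = 16  # one 16-bit table entry per dead nozzle


def table_kbytes(nozzles_per_color, dead_fraction):
    """Dead nozzle table size in KBytes (assuming 1 KByte = 1024 bytes)."""
    return ENTRY_BITS * nozzles_per_color * dead_fraction / 8 / 1024


def average_bits_per_cycle(dead_fraction):
    """Average DRAM read bandwidth, assuming one dot column per cycle."""
    return ENTRY_BITS * dead_fraction
```

For the A4/Letter printhead with 13824 nozzles per color, table_kbytes(13824, 0.10) evaluates to approximately 2.7 and average_bits_per_cycle(0.05) to approximately 0.8 (one 16-bit read every 20 cycles), in line with the table entries.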

29.4 Nozzle compensation 

The DNC receives 6 bits of dot information every cycle from the HCU, 1 bit per color plane. When the dot 
position corresponds to a dead nozzle column, the associated 6-bit dn_mask indicates which ink plane(s) 
contains a dead nozzle(s). The DNC first deletes dots destined for the dead nozzle. It then replaces those 
dead dots, either by placing the data destined for the dead nozzle into an adjacent ink plane (direct 
substitution) or into a number of ink planes (indirect substitution). After ink replacement, if a dead nozzle is 
made active again then the DNC performs error diffusion. Finally, following the dead nozzle 
compensation mechanisms the fixative, if present, may need to be adjusted due to new nozzles being activated, or 
dead nozzles being removed. 

29.4.1 Dead nozzle removal 

If a nozzle is defined as dead, then the first action for the DNC is to turn off (zero) the dot data destined 
for that nozzle. This is done by a bit-wise ANDing of the inverse of the dn_mask with the dot value. 

29.4.2 Ink replacement 

Ink replacement is a mechanism where data destined for the dead nozzle is placed into an adjacent ink 
plane of the same color (direct substitution, i.e. K -> K_alternative), or placed into a number of ink planes, the 
combination of which produces the desired color (indirect substitution, i.e. K -> CMY). Ink replacement is 
performed by filtering out ink belonging to nozzles that are dead and then adding back in an appropriately 
calculated pattern. This two step process allows the optional re-inclusion of the ink data into the original 
dead nozzle position to be subsequently error diffused. In the general case, fixative data destined for a dead 
nozzle should not be left active intending it to be later diffused. 

The ink replacement mechanism has 6 ink replacement patterns, one per ink plane, programmable by the 
CPU. The dead nozzle mask is ANDed with the dot data to see if there are any planes where the dot is 
active but the corresponding nozzle is dead. The resultant value forms an enable, on a per ink basis, for the 
ink replacement process. If replacement is enabled for a particular ink, the values from the corresponding 
replacement pattern register are ORed into the dot data. The output of the ink replacement process is then 
filtered so that error diffusion is only allowed for the planes in which error diffusion is enabled. The output 
of the ink replacement logic is ORed with the resultant dot after dead nozzle removal. See Figure 210 on 
page 459 for implementation details. 

For example, consider the printhead color configuration C,M,Y,K1,K2,IR with input dot data from
the HCU of b101100. Assume that the K1 ink plane and IR ink plane for this position are dead, so the
dead nozzle mask is b000101. The DNC first removes the dead nozzle by zeroing the K1 plane to produce
b101000. Then the dead nozzle mask is ANDed with the dot data to give b000100, which selects the ink
replacement pattern for K1 (in this case the ink replacement pattern for K1 is configured as b000010, i.e.
ink replacement into the K2 plane). Providing error diffusion for K2 is enabled, the output from the ink
replacement process is b000010. This is ORed with the output of dead nozzle removal to produce the
resultant dot b101010. As can be seen, the dot data in the defective K1 nozzle was removed and replaced by
a dot in the adjacent K2 nozzle in the same dot position, i.e. direct substitution.
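
The removal and replacement steps of this example can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: the function name, the list-based pattern storage and the bit
numbering (bit 0 is the right-most digit of the b-notation, so K1 is bit 2) are hypothetical, and the filter
step is modelled as a simple AND with the per-plane diffuse enable bits, which is one reading of the text.

```python
PLANE_MASK = 0b111111  # 6 ink planes

def remove_and_replace(dot, dn_mask, patterns, diffuse_enable=PLANE_MASK):
    """Return the dot after dead nozzle removal and ink replacement."""
    removed = dot & ~dn_mask & PLANE_MASK   # zero every plane whose nozzle is dead
    enable = dot & dn_mask                  # planes that were active but are dead
    replacement = 0
    for plane in range(6):
        if enable & (1 << plane):
            replacement |= patterns[plane]  # OR in that plane's replacement pattern
    replacement &= diffuse_enable           # assumed form of the filter step
    return removed | replacement

# Direct substitution: K1 (bit 2) is replaced by K2 (bit 1).
patterns = [0] * 6
patterns[2] = 0b000010
print(format(remove_and_replace(0b101100, 0b000101, patterns), "06b"))  # 101010
```

Configuring the K1 pattern as b111000 instead gives indirect substitution into the CMY planes with the
same function.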

In the example above the K1 ink plane could be compensated for by indirect substitution, in which case the ink
replacement pattern for K1 would be configured as b111000 (substitution into the CMY color planes), and
this is ORed with the output of dead nozzle removal to produce the resultant dot b111000. Here the dot
data in the defective ink plane was removed and placed into the CMY ink planes.

29.4.3 Error diffusion

Based on the programming of the lookup table the dead nozzle may be left active after ink replacement. In 
such cases the DNC can compensate using error diffusion. Error diffusion is a mechanism where dead noz- 
zle dot data is diffused to adjacent dots. 

When a dot is active and its destined nozzle is dead, the DNC will attempt to place the data into an adjacent
dot position, if one is inactive. If both dots are inactive then the choice is arbitrary, and is determined
by a pseudo random bit generator. If both neighbor dots are already active then the bit cannot be
compensated by diffusion.

Since the DNC needs to look at neighboring dots to determine where to place the new bit (if required), the
DNC works on a set of 3 dots at a time. For any given set of 3 dots, the first dot received from the HCU is
referred to as dot A, the second as dot B, and the third as dot C. The relationship is shown in Figure 206.

[Figure: dot A, dot B and dot C in sequence along the direction of dot movement]

Figure 206. Set of dots operated on for error diffusion


For any given set of dots ABC, only B can be compensated for by error diffusion if B is defined as dead. A
1 in dot B will be diffused into either dot A or dot C if possible. If there is already a 1 in dot A or dot C
then a 1 in dot B cannot be diffused into that dot.

The DNC must support adjacent dead nozzles. Thus if dot A is defined as dead and has previously been
compensated for by error diffusion, then the dot data from dot B should not be diffused into dot A.
Similarly, if dot C is defined as dead, then dot data from dot B should not be diffused into dot C.

Error diffusion should not cross line boundaries. If dot B contains a dead nozzle and is the first dot in a line 
then dot A represents the last dot from the previous line. In this case an active bit on a dead nozzle of dot B 
should not be diffused into dot A. Similarly, if dot B contains a dead nozzle and is the last dot in a line then 
dot C represents the first dot of the next line. In this case an active bit on a dead nozzle of dot B should not 
be diffused into dot C. 

Thus, as a rule, a 1 in dot B cannot be diffused into dot A if 

• a 1 is already present in dot A, 

• dot A is defined as dead,

• or dot A is the last dot in a line. 

Similarly, a 1 in dot B cannot be diffused into dot C if

• a 1 is already present in dot C, 

• dot C is defined as dead, 

• or dot C is the first dot in a line. 

If B is defined to be dead and the dot value for B is 0, then no compensation needs to be done and dots A 
and C do not need to be changed. 

If B is defined to be dead and the dot value for B is 1, then B is changed to 0 and the DNC attempts to 
place the 1 from B into either A or C: 

• If the dot can be placed into both A and C, then the DNC must choose between them. The preference is
given by the current output from the random bit generator, 0 for "prefer left" (dot A) or 1 for "prefer
right" (dot C).

• If the dot can be placed into only one of A and C, then the 1 from B is placed into that position.

• If the dot cannot be placed into either one of A or C, then the DNC cannot place the dot in either position.

Table 147 shows the truth table for DNC error diffusion operation when dot B is defined as dead.

Table 147. Error Diffusion Truth Table when dot B is dead

A blocked (c) | B input | C blocked (c) | Random bit (a) | A output | B output | C output
0 | 1 | 0 | 0 | 1 (b) | 0 | C input
0 | 1 | 0 | 1 | A input | 0 | 1 (b)
0 | 1 | 1 | X | 1 (b) | 0 | C input
1 | 1 | 0 | X | A input | 0 | 1 (b)
1 | 1 | 1 | X | A input | 0 | C input
X | 0 | X | X | A input | 0 | C input

a. Output from random bit generator. Determines direction of error diffusion (0 = left, 1 = right)

b. Marks a 1 inserted by the DNC

c. A dot is blocked if a 1 is already present in it, if it is defined as dead, or if it lies across a line
boundary (dot A is the last dot in a line, or dot C is the first dot in a line)

The random bit value used to arbitrarily select the direction of diffusion is generated by a 32-bit maximum
length random bit generator. The generator generates a new bit for each dot in a line regardless of whether
the dot is dead or not. The random bit generator can be initialized with a 32-bit programmable seed value.
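
The diffusion rules of this section can be condensed into a small single-plane function. The following is a
sketch only, not the DNC logic itself: the function and argument names are hypothetical, it handles one ink
plane at a time, and it assumes dot B has already been identified as dead for that plane.

```python
def diffuse_bit(a, b, c, *, a_dead=False, c_dead=False,
                b_first_in_line=False, b_last_in_line=False, random_bit=0):
    """Return (A, B, C) for one ink plane, given that dot B is dead."""
    if b == 0:
        return a, 0, c                      # nothing to compensate
    # A is a valid target only if empty, not dead, and on the same line as B.
    can_a = a == 0 and not a_dead and not b_first_in_line
    # C is a valid target under the mirrored conditions.
    can_c = c == 0 and not c_dead and not b_last_in_line
    if can_a and can_c:
        # random_bit: 0 = prefer left (dot A), 1 = prefer right (dot C)
        return (a, 0, 1) if random_bit else (1, 0, c)
    if can_a:
        return 1, 0, c
    if can_c:
        return a, 0, 1
    return a, 0, c                          # the dot cannot be diffused

print(diffuse_bit(0, 1, 0, random_bit=1))   # (0, 0, 1)
```

In hardware the same decision is made for all 6 planes in parallel; the sketch keeps to one plane so that
each rule stays visible.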



29.4.4 Fixative correction 

After the dead nozzle compensation methods have been applied to the dot data, the fixative, if present, may
need to be adjusted due to new nozzles being activated, or dead nozzles being removed. For each output
dot the DNC determines if fixative is required (using the FixativeRequiredMask register) for the new
compensated dot data word and whether fixative is activated already for that dot. For the DNC to do so it
needs to know the color plane that has fixative; this is specified by the FixativeMask1 configuration register.
Table 148 indicates the actions to take based on these calculations.



Table 148. Truth table for fixative correction

Fixative present | Fixative required | Action
1 | 1 | Output dot as is.
1 | 0 | Clear fixative plane.
0 | 1 | Attempt to add fixative.
0 | 0 | Output dot as is.



The DNC also allows the specification of another fixative plane, specified by the FixativeMask2 configuration
register, with FixativeMask1 having the higher priority over FixativeMask2. When attempting to add
fixative the DNC first tries to add it into the planes defined by FixativeMask1. However, if any of these
planes is dead then it tries to add fixative by placing it into the planes defined by FixativeMask2.

Note that the fixative defined by FixativeMask1 and FixativeMask2 could possibly be multi-part fixative,
i.e. 2 bits could be set in FixativeMask1 with the fixative being a combination of both inks.

29.5 Implementation

A block diagram of the DNC is shown in Figure 207.

Figure 207. Block diagram of DNC

29.5.1 Definitions of I/O



Table 149. DNC port list and description

Port name | Pins | I/O | Description

Clocks and Resets
pclk | 1 | In | System Clock.
prst_n | 1 | In | System reset, synchronous active low.

PCU interface
pcu_dnc_sel | 1 | In | Block select from the PCU. When pcu_dnc_sel is high both pcu_adr and pcu_dataout are valid.
pcu_rwn | 1 | In | Common read/not-write signal from the PCU.
pcu_adr[6:2] | 5 | In | PCU address bus. Only 5 bits are required to decode the address space for this block.
pcu_dataout[31:0] | 32 | In | Shared write data bus from the PCU.
dnc_pcu_rdy | 1 | Out | Ready signal to the PCU. When dnc_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on dnc_pcu_data is valid.
dnc_pcu_data[31:0] | 32 | Out | Read data bus to the PCU.

DIU interface
dnc_diu_rreq | 1 | Out | DNC unit requests DRAM read. A read request must be accompanied by a valid read address.
dnc_diu_radr[21:5] | 17 | Out | Read address to DIU, 256-bit word aligned.
diu_dnc_rack | 1 | In | Acknowledge from DIU that read request has been accepted and new read address can be placed on dnc_diu_radr.
diu_dnc_rvalid | 1 | In | Read data valid, active high. Indicates that valid read data is now on the read data bus, diu_data.
diu_data[63:0] | 64 | In | Read data from DIU.

HCU interface
dnc_hcu_ready | 1 | Out | Indicates that DNC is ready to accept data from the HCU.
hcu_dnc_avail | 1 | In | Indicates valid data present on hcu_dnc_data.
hcu_dnc_data[5:0] | 6 | In | Output bi-level dot data in 6 ink planes.

DWU interface
dwu_dnc_ready | 1 | In | Indicates that DWU is ready to accept data from the DNC.
dnc_dwu_avail | 1 | Out | Indicates valid data present on dnc_dwu_data.
dnc_dwu_data[5:0] | 6 | Out | Output bi-level dot data in 6 ink planes.



29.5.2 Configuration registers 

The configuration registers in the DNC are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for the description of the protocol and timing diagrams for reading and writing registers in the
DNC. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads
and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the
DNC. When reading a register that is less than 32 bits wide zeros should be returned on the upper unused
bit(s) of dnc_pcu_data. Table 150 lists the configuration registers in the DNC.

Table 150. DNC configuration registers

Address | Register name | #bits | Reset | Description

Control registers
0x00 | Reset | 1 | 0x1 | A write to this register causes a reset of the DNC.
0x04 | Go | 1 | 0x0 | Writing 1 to this register starts the DNC. Writing 0 to this register halts the DNC. When Go is asserted all counters, flags etc. are cleared or given their initial value, but configuration registers keep their values. When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values. This register can be read to determine if the DNC is running (1 = running, 0 = stopped).

Setup registers (constant during processing)
0x10 | MaxDot | 16 | 0x0000 | This is the maximum dot number - 1 present across a page. For example if a page contains 13824 dots, then MaxDot will be 13823. Note that this number may or may not be the same as the number of dots across the printhead as some margining may be introduced in the PHI.
0x14 | LSFR | 32 | 0x0000_0000 | The current value of the LFSR register used as the 32-bit maximum length random bit generator. Users can write to this register to program a seed value for the 32-bit maximum length random bit generator. Must not be all 1s for taps implemented in XNOR form. (It is expected that writing a seed value will not occur during the operation of the LFSR.) This LSFR value could also have a possible use as a random source in program code.
0x20 | FixativeMask1 | 6 | 0x00 | Defines the higher priority fixative plane(s). Bit 0 represents the settings for plane 0, bit 1 for plane 1 etc. For each bit: 1 = the ink plane contains fixative. 0 = the ink plane does not contain fixative.
0x24 | FixativeMask2 | 6 | 0x00 | Defines the lower priority fixative plane(s). Bit 0 represents the settings for plane 0, bit 1 for plane 1 etc. Used only when FixativeMask1 planes are dead. For each bit: 1 = the ink plane contains fixative. 0 = the ink plane does not contain fixative.
0x28 | FixativeRequiredMask | 6 | 0x00 | Identifies the ink planes that require fixative. Bit 0 represents the settings for plane 0, bit 1 for plane 1 etc. For each bit: 1 = the ink plane requires fixative. 0 = the ink plane does not require fixative (e.g. ink is self-fixing).
0x30 | DnTableStartAdr | 17 | 0x0_0000 | Start address of Dead Nozzle Table in DRAM, specified in 256-bit words.
0x34 | DnTableEndAdr | 17 | 0x0_0000 | End address of Dead Nozzle Table in DRAM, specified in 256-bit words, i.e. the location containing the last entry in the Dead Nozzle Table. The Dead Nozzle Table should be aligned to a 256-bit boundary; if necessary it can be padded with null entries.
0x40 - 0x54 | PlaneReplacePattern[5:0] | 6x6 | 0x00 | Defines the ink replacement pattern for each of the 6 ink planes. PlaneReplacePattern[0] is the ink replacement pattern for plane 0, PlaneReplacePattern[1] is the ink replacement pattern for plane 1, etc. For each 6-bit replacement pattern for a plane, a 1 in any bit position indicates the alternative ink planes to be used for this plane.
0x58 | DiffuseEnable | 6 | 0x3F | Defines whether, after ink replacement, error diffusion is allowed to be performed on each plane. Bit 0 represents the settings for plane 0, bit 1 for plane 1 etc. For each bit: 1 = error diffusion is enabled. 0 = error diffusion is disabled.

Debug registers (read only)
0x60 | DncOutputDebug | 8 | N/A | Bit 7 = dwu_dnc_ready. Bit 6 = dnc_dwu_avail. Bits 5-0 = dnc_dwu_data.
0x64 | DncReplaceDebug | 14 | N/A | Bit 12 = iru_avail. Bits 11-6 = iru_dn_mask. Bits 5-0 = iru_data.
0x68 | DncDiffuseDebug | 14 | N/A | Bit 13 = dwu_dnc_ready. Bit 12 = dnc_dwu_avail. Bits 11-6 = edu_dn_mask. Bits 5-0 = edu_data.

29.5.3 Ink replacement unit

Figure 208 shows a sub-block diagram for the ink replacement unit.

Figure 208. Sub-block diagram of ink replacement unit

29.5.3.1 Control unit

The control unit is responsible for reading the dead nozzle table from DRAM and making it available to
the DNC via the dead nozzle FIFO. The dead nozzle table is read from DRAM in single 256-bit accesses,
receiving the data from the DIU over 4 clock cycles (64 bits per cycle). The protocol and timing for read
accesses to DRAM is described in section 20.9.1 on page 208. Reading from DRAM is implemented by
means of the state machine shown in Figure 209.

All counters and flags should be cleared after reset. When Go transitions from 0 to 1 all counters and flags 
should take their initial value. While the Go bit is 1, the state machine requests a read access from the dead 
nozzle table in DRAM provided there is enough space in its FIFO. 

A modulo-4 counter, rd_count, is used to count each of the 64-bit values received in a 256-bit read access.
It is incremented whenever diu_dnc_rvalid is asserted. When Go is 1, dn_table_radr is set to
dn_table_start_adr. As each 64-bit value is returned, indicated by diu_dnc_rvalid being asserted,
dn_table_radr is compared to dn_table_end_adr:

• If rd_count equals 3 and dn_table_radr equals dn_table_end_adr, then dn_table_radr is updated to
dn_table_start_adr.

• If rd_count equals 3 and dn_table_radr does not equal dn_table_end_adr, then dn_table_radr is
incremented by 1.

A count is kept of the number of 64-bit values in the FIFO. When diu_dnc_rvalid is 1, data is written to the
FIFO by asserting wr_en, and fifo_contents and fifo_wr_adr are both incremented.

When fifo_contents[3:0] is greater than 0 and edu_ready is 1, dnc_hcu_ready is asserted to indicate that
the DNC is ready to accept dots from the HCU. If hcu_dnc_avail is also 1 then a dotadv pulse is sent to the
GenMask unit, indicating the DNC has accepted a dot from the HCU, and iru_avail is also asserted. After
Go is set, a single preload pulse is sent to the GenMask unit once the FIFO contains data.

When a rd_adv pulse is received from the GenMask unit, fifo_rd_adr[4:0] is incremented to select
the next 16-bit value. If fifo_rd_adr[1:0] = 11 then the next 64-bit value is read from the FIFO by asserting
rd_en, and fifo_contents[3:0] is decremented.
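
The read address update described by the two bullet points above can be written as a small state update.
This is a sketch under assumptions: the function name is hypothetical, and one call models one cycle in
which diu_dnc_rvalid is asserted.

```python
def next_read_state(rd_count, dn_table_radr, start_adr, end_adr):
    """Advance rd_count and dn_table_radr for one returned 64-bit value."""
    if rd_count == 3:
        # Last 64-bit value of the 256-bit access: move to the next word,
        # wrapping from the end address back to the start address.
        if dn_table_radr == end_adr:
            dn_table_radr = start_adr
        else:
            dn_table_radr += 1
    return (rd_count + 1) % 4, dn_table_radr
```

The wrap back to the start address lets the same dead nozzle table be re-read repeatedly without software
intervention.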



Figure 209. Dead nozzle table state machine



29.5.3.2 Dead nozzle FIFO

The dead nozzle FIFO conceptually is a 64-bit input, and 16-bit output FIFO to account for the 64-bit data
transfers from the DIU, and the individual 16-bit entries in the dead nozzle table that are used in the
GenMask unit. In reality, the FIFO is actually 8 entries deep and 64 bits wide (to accommodate two 256-bit
accesses).

On the DRAM side of the FIFO the write address is 64-bit aligned while on the GenMask side the read
address is 16-bit aligned, i.e. the upper 3 bits are input as the read address for the FIFO and the lower 2 bits
are used to select 16 bits from the 64 bits (first 16 bits read corresponds to bits 15-0, second 16 bits to bits
31-16 etc.).

29.5.3.3 GenMask unit

The GenMask unit generates the 6-bit dn_mask that is sent to the replace unit. It consists of a 10-bit delta 
counter and a mask register. 

After Go is set, the GenMask unit will receive a preload pulse from the control unit indicating the first
dead nozzle table entry is available at the output of the dead nozzle FIFO and should be loaded into the
delta counter and mask register. A rd_adv pulse is generated so that the next dead nozzle table entry is
presented at the output of the dead nozzle FIFO. The delta counter is decremented every time a dotadv pulse
is received. When the delta counter reaches 0, it gets loaded with the current delta value output from the
dead nozzle FIFO, i.e. bits 15-6, and the mask register gets loaded with mask output from the dead nozzle
FIFO, i.e. bits 5-0. A rd_adv pulse is then generated so that the next dead nozzle table entry is presented at
the output of the dead nozzle FIFO.

When the delta counter is 0 the value in the mask register is output as the dn_mask, otherwise the dn_mask
is all 0s.
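
As an illustration of the addressing and entry layout described for the dead nozzle FIFO and the GenMask
unit, the sketch below selects a 16-bit entry from a FIFO of 64-bit words and splits it into its delta and
mask fields. The function names and the list used to model the FIFO are hypothetical.

```python
def read_entry(fifo_words, fifo_rd_adr):
    """Select the 16-bit table entry addressed by the 5-bit read address."""
    word = fifo_words[(fifo_rd_adr >> 2) & 0x7]   # upper 3 bits pick the 64-bit word
    lane = fifo_rd_adr & 0x3                      # lower 2 bits pick the 16-bit slice
    return (word >> (16 * lane)) & 0xFFFF         # lane 0 is bits 15-0, lane 1 is bits 31-16

def split_entry(entry):
    """Split a 16-bit entry into (delta, mask): bits 15-6 and bits 5-0."""
    return (entry >> 6) & 0x3FF, entry & 0x3F
```

For instance, an entry built as (25 << 6) | 0b000101 splits into a delta of 25 and a mask of b000101.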

The GenMask unit has no knowledge of the number of dots in a line, it simply loads a counter to count the 
delta from one dead nozzle column to the next. Thus as described in section 29.2 on page 446 the dead 
nozzle table should include null identifiers if necessary so that the dead nozzle table covers the first and 
last nozzle column in a line. 

29.5.3.4 Replace unit 

Dead nozzle removal and ink replacement are implemented by the combinatorial logic shown in Figure
210. Dead nozzle removal is performed by bit-wise ANDing of the inverse of the dn_mask with the dot
value.

The ink replacement mechanism has 6 ink replacement patterns, one per ink plane, programmable by the
CPU. The dead nozzle mask is ANDed with the dot data to see if there are any planes where the dot is
active but the corresponding nozzle is dead. The resultant value forms an enable, on a per ink basis, for the
ink replacement process. If replacement is enabled for a particular ink, the values from the corresponding
replacement pattern register are ORed into the dot data. The output of the ink replacement process is then
filtered so that error diffusion is only allowed for the planes in which error diffusion is enabled.

The output of the ink replacement process is ORed with the resultant dot after dead nozzle removal. If the
dot position does not contain a dead nozzle then the dn_mask will be all 0s and the dot, hcu_dnc_data, will
be passed through unchanged.

Figure 210. Logic for dead nozzle removal and ink replacement

29.5.4 Error Diffusion Unit

Figure 211 shows a sub-block diagram for the error diffusion unit.

Figure 211. Sub-block diagram of error diffusion unit

29.5.4.1 Random Bit Generator

The random bit value used to arbitrarily select the direction of diffusion is generated by a maximum length
32-bit LFSR. The tap points and feedback generation are shown in Figure 212. The LFSR generates a new
bit for each dot in a line regardless of whether the dot is dead or not, i.e. shifting of the LFSR is enabled
when advdot equals 1. The LFSR can be initialised with a 32-bit programmable seed value, random_seed.
This seed value is loaded into the LFSR whenever a write occurs to the RandomSeed register. Note that the
seed value must not be all 1s as this causes the LFSR to lock up.



[Figure: 32-bit shift register, bits 31 down to 0, with XNOR feedback and an output bit]

Figure 212. Maximum length 32-bit LFSR used for random bit generation
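
The behavior of an XNOR-feedback LFSR, including the all-1s lock-up state, can be demonstrated with a short
sketch. Note the assumptions: the tap positions below are a commonly tabulated set for 32-bit registers and
are not taken from Figure 212, and the shift direction and the choice of bit 31 as the output bit are likewise
illustrative.

```python
MASK32 = 0xFFFFFFFF
TAPS = (31, 21, 1, 0)  # assumed tap positions, not read from Figure 212

def lfsr_step(state):
    """Shift the register left by one, inserting the XNOR of the taps at bit 0."""
    parity = 0
    for tap in TAPS:
        parity ^= (state >> tap) & 1
    feedback = parity ^ 1                     # XNOR is the inverse of XOR
    return ((state << 1) | feedback) & MASK32

def random_bits(seed, count):
    """Return `count` output bits, taking bit 31 as the (assumed) output."""
    state = seed & MASK32
    bits = []
    for _ in range(count):
        bits.append((state >> 31) & 1)
        state = lfsr_step(state)
    return bits
```

With an even number of taps, an all-1s state feeds back a 1 on every shift and never changes, which is why
that seed must be avoided. An all-0s seed is valid in XNOR form.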



29.5.4.2 Advance Dot Unit 

The advance dot unit is responsible for determining in a given cycle whether or not the error diffuse unit
will accept a dot from the ink replacement unit or make a dot available to the fixative correct unit and on to
the DWU. It therefore receives the dwu_dnc_ready control signal from the DWU, the iru_avail flag from
the ink replacement unit, and generates dnc_dwu_avail and edu_ready control flags.

The unit only needs to check the dwu_dnc_ready signal to see if a dot can be accepted, and asserts edu_ready
to indicate this. If the error diffuse unit is ready to accept a dot and the ink replacement unit has a dot
available, then an advdot pulse is given to shift the dot into the pipeline in the diffuse unit. Note that since the
error diffusion operates on 3 dots, the advance dot unit ignores dwu_dnc_ready initially until 3 dots have
been accepted by the diffuse unit. Similarly dnc_dwu_avail is not asserted until the diffuse unit contains 3
dots and the ink replacement unit has a dot available.

29.5.4.3 Diffuse Unit

The diffuse unit contains the combinatorial logic to implement the truth table from Table 147. The diffuse
unit receives a dot consisting of 6 color planes (1 bit per plane) as well as an associated 6-bit dead nozzle
mask value.

Error diffusion is applied to all 6 planes of the dot in parallel. Since error diffusion operates on 3 dots, the
diffuse unit has a pipeline of 3 dots and their corresponding dead nozzle mask values. The first dot
received is referred to as dot A, the second as dot B, and the third as dot C. Dots are shifted along the
pipeline whenever advdot is 1. A count is also kept of the number of dots received. It is incremented
whenever advdot is 1, and wraps to 0 when it reaches max_dot. When the dot count is 0 dot C corresponds to the
first dot in a line. When the dot count is 1 dot A corresponds to the last dot in a line.

In any given set of 3 dots only dot B can be defined as containing a dead nozzle(s). Dead nozzles are
identified by bits set in iru_dn_mask. If dot B contains a dead nozzle(s), the corresponding bit(s) in dot A, dot
C, the dead nozzle mask value for A, the dead nozzle mask value for C, the dot count, as well as the
random bit value are input to the truth table logic and the dots A, B and C assigned accordingly. If dot B does
not contain a dead nozzle then the dots are shifted along the pipeline unchanged.

29.5.5 Fixative Correction Unit 

The fixative correction unit consists of combinatorial logic to implement fixative correction as defined in
Table 151. For each output dot the DNC determines if fixative is required for the new compensated dot
data word and whether fixative is activated already for that dot.

FixativePresent = ((FixativeMask1 | FixativeMask2) & edu_data) != 0
FixativeRequired = (FixativeRequiredMask & edu_data) != 0

It then looks up the truth table to see what action, if any, needs to be taken.



Table 151. Truth table for fixative correction

Fixative present | Fixative required | Action | Output
1 | 1 | Output dot as is. | dnc_dwu_data = edu_data
1 | 0 | Clear fixative plane. | dnc_dwu_data = (edu_data) & ~(FixativeMask1 | FixativeMask2)
0 | 1 | Attempt to add fixative. | if (FixativeMask1 & DnMask) != 0: dnc_dwu_data = (edu_data) | (FixativeMask2 & ~DnMask), else: dnc_dwu_data = (edu_data) | (FixativeMask1)
0 | 0 | Output dot as is. | dnc_dwu_data = edu_data
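
The fixative correction truth table can be expressed directly as a function. This is a sketch for
illustration: the function and argument names are hypothetical, and the condition for falling back to
FixativeMask2 is read as "a FixativeMask1 plane is dead", which follows the prose of this section.

```python
def fixative_correct(edu_data, dn_mask, fix_mask1, fix_mask2, required_mask):
    """Apply fixative correction to one 6-bit dot and return the output dot."""
    present = ((fix_mask1 | fix_mask2) & edu_data) != 0
    required = (required_mask & edu_data) != 0
    if present and not required:
        # Fixative is set but no ink needs it: clear the fixative planes.
        return edu_data & ~(fix_mask1 | fix_mask2) & 0x3F
    if required and not present:
        if fix_mask1 & dn_mask:
            # Higher priority fixative plane is dead: try the lower priority one.
            return edu_data | (fix_mask2 & ~dn_mask & 0x3F)
        return edu_data | fix_mask1
    return edu_data  # fixative already consistent with the dot
```

If both fixative masks are 0 the function returns the dot unchanged in every case, matching the note that
the dot data is not changed when no fixative planes are configured.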



When attempting to add fixative the DNC first tries to add it into the plane defined by FixativeMask1.
However, if this plane is dead then it tries to add fixative by placing it into the plane defined by
FixativeMask2. Note that if FixativeMask1 and FixativeMask2 are both all 0s then the dot data will
not be changed.

30 Dotline Writer Unit (DWU)

30.1 Overview

The Dotline Writer Unit (DWU) receives 1 dot (6 bits) of color information per cycle from the DNC. Dot 
data received is bundled into 256-bit words and transferred to the DRAM. The DWU (in conjunction with 
the LLU) implements a dot line FIFO mechanism to compensate for the physical placement of nozzles in a 
printhead, and provides data rate smoothing to allow for local complexities in the dot data generate pipe- 
line. 



[Figure: DNC, DWU and DRAM (via DIU) connected by dot data and control paths]

Figure 213. High level data flow diagram of DWU in context



30.2 Physical requirement imposed by the printhead 

The physical placement of nozzles in the printhead means that in one firing sequence of all nozzles, dots
will be produced over several print lines. The printhead consists of 12 rows of nozzles, one for each color
of odd and even dots. Odd and even nozzles are separated by D2 print lines and nozzles of different colors
are separated by D1 print lines. See Figure 214 for reference. The first color to be printed is the first row of
nozzles encountered by the incoming paper. In the example this is color 0 odd, although this is dependent on
the printhead type (see Section 35 Memjet Printhead for other printhead arrangements). Paper passes under
the printhead moving downwards.



[Figure: Type 0 and Type 1 printheads. Nozzle rows from top to bottom: Color 5 Even, Color 5 Odd,
Color 4 Even, Color 4 Odd, Color 3 Even, Color 3 Odd, Color 2 Even, Color 2 Odd, Color 1 Even,
Color 1 Odd, Color 0 Even, Color 0 Odd. Annotations: shift register order, paper direction, dimensions
of 32 µm and 80 µm, 5 lines.]

Note: Paper passes under printhead

Figure 214. Printhead Nozzle Layout for conceptual 36 Nozzle bilithic printhead

For example, if the physical separation of each half row is 80 µm, this equates to D1=D2=5 print lines at
1600 dpi. This means that in one firing sequence, color 0 odd nozzles will fire on dotline L, color 0 even
nozzles will fire on dotline L-D1, color 1 odd nozzles will fire on dotline L-D1-D2 and so on over 6 color
planes odd and even nozzles. The total number of lines fired over is given as 0+5+5+...+5 = 0 + 11x5 = 55.

See Figure 215 for example diagram.

Figure 215. Paper and printhead nozzles relationship (example with D1=D2=5)

It is expected that the physical spacing of the printhead nozzles will be 80 µm (or 5 dot lines), although
there is no dependency on nozzle spacing. The DWU is configurable to allow other line nozzle spacings.



Table 152. Relationship between Nozzle color/sense and line firing

Color | Sense | Line (even row first) | Line (odd row first)
Color 0 | even | L | L-5
Color 0 | odd | L-5 | L
Color 1 | even | L-10 | L-15
Color 1 | odd | L-15 | L-10
Color 2 | even | L-20 | L-25
Color 2 | odd | L-25 | L-20
Color 3 | even | L-30 | L-35
Color 3 | odd | L-35 | L-30
Color 4 | even | L-40 | L-45
Color 4 | odd | L-45 | L-40
Color 5 | even | L-50 | L-55
Color 5 | odd | L-55 | L-50
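
The line offsets in this section follow from D1 and D2 by simple accumulation. The sketch below is a
hypothetical helper that reproduces the case where color 0 odd fires on line L, using the convention of the
worked example (color 0 even on L-D1, color 1 odd on L-D1-D2).

```python
def firing_offsets(colors=6, d1=5, d2=5):
    """Return {(color, sense): lines behind L} when color 0 odd fires on line L."""
    offsets = {}
    for color in range(colors):
        base = color * (d1 + d2)              # odd row of this color
        offsets[(color, "odd")] = base
        offsets[(color, "even")] = base + d1  # even row follows d1 lines later
    return offsets

offsets = firing_offsets()
print(max(offsets.values()))  # 55
```

With D1=D2=5 the deepest row is 55 lines behind L, and the offsets over all 12 rows sum to 330.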



30.3 Line rate de-coupling 

The DWU block is required to compensate for the physical spacing between lines of nozzles. It does this 
by storing dot lines in a FIFO (in DRAM) until such time as they are required by the LLU for dot data 
transfer to the printhead interface. Colors are stored separately because they are needed at different times 
by the LLU. The dot line store must store enough lines to compensate for the physical line separation of 
the printhead but can optionally store more lines to allow system level data rate variation between the read 
(printhead feed) and write sides (dot data generation pipeline) of the FIFOs.

A logical representation of the FIFOs is shown in Figure 216, where N is defined as the optional number of
extra half lines in the dot line store for data rate de-coupling.

[Figure: one FIFO per half color (Color 0 to Color 5, odd and even) between the DWU side and the LLU
side, drawn for the case where the even row is encountered first, with the extra line store marked]

Figure 216. Dot line store logical representation



30.4 Dot line store storage requirements 

For an arbitrary page width of d dots (where d is even), the number of dots per half line is d/2. 

For interline spacing of D2 and inter-color spacing of D1, with C colors of odd and even half lines, the
number of half line storage is (C - 1) (D2 + D1) + D1.

For N extra half line stores for each color odd and even, the storage is given by (N * C * 2).

The total storage requirement is ((C - 1) (D2 + D1) + D1 + (N * C * 2)) * d/2 in bits.

Note that when determining the storage requirements for the dot line store, the number of dots per line is
the page width and not necessarily the printhead width. The page width is often the dot margin number of
dots less than the printhead width. They can be the same size for full bleed printing.

For example in an A4 page a line consists of 13824 dots at 1600 dpi, or 6912 dots per half dot line. To
store just enough dot lines to account for an inter-line nozzle spacing of 5 dot lines it would take 55 half
dot lines for color 5 odd, 50 dot lines for color 5 even and so on, giving 55+50+45...10+5+0 = 330 half dot
lines in total. If it is assumed that N=4 then the storage required to store 4 extra half lines per color is 4 x
12 = 48, in total giving 330+48 = 378 half dot lines. Each half dot line is 6912 dots, at 1 bit per dot giving a
total storage requirement of 6912 dots x 378 half dot lines / 8 bits = approx. 319 Kbytes. Similarly for an
A3 size page with 19488 dots per line, 9744 dots per half line x 378 half dot lines / 8 = approx. 899
Kbytes.
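
The A4 arithmetic in the worked example can be checked with a short calculation. The helper below is
hypothetical and follows the worked example rather than the closed-form expression: it sums the per-row
line counts (0, 5, 10, ... 55) and adds N extra half lines per half color.

```python
def dot_line_store(dots_per_line, spacing=5, extra_half_lines=4, colors=6):
    """Return (half dot lines stored, storage in Kbytes) at 1 bit per dot."""
    half_colors = colors * 2
    # Row i is i * spacing lines deep, following the 55+50+...+5+0 sum in the example.
    base = sum(spacing * i for i in range(half_colors))
    total_half_lines = base + extra_half_lines * half_colors
    bits = total_half_lines * (dots_per_line // 2)
    return total_half_lines, bits / 8 / 1024

lines, kbytes = dot_line_store(13824)
print(lines, round(kbytes))  # 378 319
```

Setting extra_half_lines to 0 gives the minimum store of 330 half dot lines for the same page.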



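Table 153. Storage requirement for dot line store

Page size | Nozzle spacing (dot lines) | Half lines required (N=0) | Storage (Kbytes) | Half lines required (N=4) | Storage (Kbytes)
A4 | 4 | 264 | 223 | 312 | 263
A4 | 5 | 330 | 278 | 378 | 319
A3 | 4 | 264 | 628 | 312 | 742
A3 | 5 | 330 | 785 | 378 | 899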



The potential size of the dot line store makes it infeasible to implement in on-chip SRAM, so
the dot line store is implemented in embedded DRAM. This allows a configurable dotline store where
unused storage can be redistributed for use by other parts of the system.



30.5 Local buffering 

An embedded DRAM is expected to be of the order of 256 bits wide, which results in 27 words per half
line of an A4 page, and 54 words per half line of A3. This requires 27 words x 12 half colors (6 colors odd
and even) = 324 x 256-bit DRAM accesses over a dotline print time, equating to 6 bits per cycle (equal to the
DNC generate rate of 6 bits per cycle). Each half color is required to be double buffered: while one
buffer is being filled, the other is being written to DRAM. This results in 256 bits x 2 buffers x 12 half colors, i.e.
6144 bits in total.

The buffer requirement can be reduced by using 1.5x buffering, where the DWU is filling 128 bits while the
remaining 256 bits are being written to DRAM. While this reduces the required local buffering, it
increases the peak bandwidth requirement to the DRAM. With 2x buffering the average and peak DRAM
bandwidth requirements are the same, at 6 bits per cycle; with 1.5x buffering the average
DRAM bandwidth requirement is 6 bits per cycle but the peak bandwidth requirement is 12 bits per cycle.
The amount of buffering used will depend on the DRAM bandwidth available to the DWU.
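The A4 figures in this section follow from simple arithmetic, sketched below. The constant and function names are illustrative only. The 1.5x total is not quoted in the text; it is derived here on the assumption that each half color holds 128 + 256 = 384 bits.

```python
import math

WORD_BITS = 256     # DRAM word width assumed in the text
HALF_COLORS = 12    # 6 colors, odd and even


def words_per_half_line(dots_per_half_line: int) -> int:
    """256-bit DRAM words needed for one half line at 1 bit per dot."""
    return math.ceil(dots_per_half_line / WORD_BITS)


def local_buffer_bits(factor: float) -> int:
    """Total local buffering for all half colors at the given buffering factor."""
    return int(WORD_BITS * factor) * HALF_COLORS


if __name__ == "__main__":
    a4_words = words_per_half_line(6912)
    print(a4_words)                  # 27 words per A4 half line
    print(a4_words * HALF_COLORS)    # 324 DRAM accesses per dot line
    print(local_buffer_bits(2.0))    # 6144 bits with 2x buffering
    print(local_buffer_bits(1.5))    # 4608 bits with 1.5x buffering (derived)
```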
[Figure: two panels, x1.5 buffering and x2 buffering, showing the write pointer and read pointer positions in the buffer at each of the following steps:
FIFO empty, writing begins.
256 bits full, DRAM request issued.
Variable cycles later, DRAM request granted and 256 bits transferred.
256 bits full, DRAM request issued again.
Variable cycles later, DRAM request granted and 256 bits transferred.
Final 256 bits full, DRAM request issued.
Variable cycles later, final DRAM request granted and 256 bits transferred.]

Figure 217. Comparison of 1.5x v 2x buffering



Should the DWU fail to get the required DRAM access within the specified time, the DWU will stall the 
DNC data generation. The DWU will issue the stall in sufficient time for the DNC to respond and still not 
cause a FIFO overrun. Should the stall persist for a sufficiently long time, the PHI will be starved of data 
and be unable to deliver data to the printhead in time. The sizing of the dotline store FIFO and internal 
FIFOs should be chosen so as to prevent such a stall happening. 



30.6 Dotline data in memory



The dot data shift register order in the printhead is shown in Figure 214 (the transmit order is the opposite 
of the shift register order). In the example the type 0 printhead IC transmit order is increasing even color 
data followed by decreasing odd color data. The type 1 printhead IC transmit order is decreasing odd color 
data followed by increasing even color data. For both printhead ICs the even data is always increasing 
order and odd data is always decreasing. The PHI controls which printhead IC data gets shifted to. 

From this it is beneficial to store even data in increasing order in DRAM and odd data in decreasing order.
While this order suits the example printhead, other printheads exist where it would be beneficial to store
even data in decreasing order, and odd data in increasing order, hence the order is configurable. The order
in which data is stored in memory is controlled by setting the ColorLineSense register.

The dot order in DRAM for increasing and decreasing sense is shown in Figure 218 and Figure 219
respectively. For each line in the dot store the order is the same (although for odd lines the numbering will
be different, the order will remain the same). Dot data from the DNC is always received in increasing dot
number order. For increasing sense, dot data is bundled into 256-bit words and written in increasing order
in DRAM, word 0 first, then word 1, and so on to word N, where N is the number of words in a line.

For decreasing sense dot data is also bundled into 256-bit words, but is written to DRAM in decreasing 
order, i.e. word N is written first then word N-1 and so on to word 0. For both increasing and decreasing 
sense the data is aligned to bit 0 of a word, i.e. increasing sense always starts at bit 0, decreasing sense 
always finishes at bit 0. 
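The effect of the two senses on the order of DRAM writes can be sketched as follows. This is an illustration of the description above rather than the DWU implementation; the function name is hypothetical, and word indices are counted from 0 within one half line.

```python
WORD_BITS = 256


def word_write_order(dots_in_half_line: int, increasing: bool) -> list[tuple[int, int]]:
    """Return (word index, valid bits) pairs in the order they are written to DRAM.

    In both senses the data is aligned to bit 0, so any partially filled
    word is the highest numbered word of the line. Increasing sense writes
    it last; decreasing sense writes it first.
    """
    full, rem = divmod(dots_in_half_line, WORD_BITS)
    words = [(w, WORD_BITS) for w in range(full)]
    if rem:
        words.append((full, rem))  # partially filled word
    return words if increasing else words[::-1]


if __name__ == "__main__":
    # Half of the 13320 dot wide line used in Figures 218 and 219
    up = word_write_order(6660, increasing=True)
    down = word_write_order(6660, increasing=False)
    print(up[0], up[-1])      # (0, 256) (26, 4)
    print(down[0], down[-1])  # (26, 4) (0, 256)
```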

Figure 218. Even dot order in DRAM (Increasing Sense, 13320 dot wide line)

Figure 219. Even dot order in DRAM (Decreasing Sense, 13320 dot wide line)



Each half color is configured independently of any other color. The ColorBaseAdr register specifies the
position where data for a particular dotline FIFO will begin writing to. Note that for increasing sense colors
the ColorBaseAdr register specifies the address of the first word of the first line of the FIFO, whereas for
decreasing sense colors the ColorBaseAdr register specifies the address of the last word of the first line of the
FIFO.

Dot data received from the DNC is bundled in 256-bit words and transferred to the DRAM. Each line of
data is stored consecutively in DRAM, with each line separated by ColorLineInc number of words.

For each line stored in DRAM the DWU increments the line count and calculates the DRAM address for
the next line to store. This process continues until ColorFifoSize number of lines are stored, after which the
DRAM address will wrap back to the ColorBaseAdr address.



[Figure: DRAM layout of the dotline FIFOs for increasing sense colors and decreasing sense colors, marking ColorBaseAdr (words), MaxWriteAhead (lines) and ColorFifoSize = N lines.]

Figure 220. Dotline FIFO data structure in DRAM

As each line is written to the FIFO, the DWU increments the FifoFillLevel register, and as the LLU reads a
line from the FIFO the FifoFillLevel register is decremented. The LLU indicates that it has completed
reading a line by a high pulse on the llu_dwu_line_rd line.

When the number of lines stored in the FIFO is equal to the MaxWriteAhead value the DWU will indicate
to the DNC that it is no longer able to receive data (i.e. a stall) by deasserting the dwu_dnc_ready signal.

The ColorEnable register determines which color planes should be processed. If a plane is turned off, data
is ignored for that plane and no DRAM accesses for that plane are generated.
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30.7 Implementation 



30.7.1 Definitions of I/O 

Table 154. DWU I/O Definition

Port name                      Pins   I/O   Description

Clocks and Resets
pclk                           1      In    System clock.
prst_n                         1      In    System reset, synchronous active low.

DNC Interface
dwu_dnc_ready                  1      Out   Indicates that the DWU is ready to accept data from the DNC.
dnc_dwu_avail                  1      In    Indicates valid data present on dnc_dwu_data.
dnc_dwu_data[5:0]              6      In    Input bi-level dot data in 6 ink planes.

LLU Interface
dwu_llu_line_wr                1      Out   DWU line write. Indicates that the DWU has completed a full
                                            line write. Active high.
llu_dwu_line_rd                1      In    LLU line read. Indicates that the LLU has completed a line
                                            read. Active high.

LLU and DWU common configuration
dwu_llu_fifo_size[11:0][7:0]   12x8   Out   Indicates the number of lines in the FIFO before the line
                                            increment will wrap around in memory.
                                            Bus 0,1 - Even, Odd line color 0
                                            Bus 2,3 - Even, Odd line color 1
                                            Bus 4,5 - Even, Odd line color 2
                                            Bus 6,7 - Even, Odd line color 3
                                            Bus 8,9 - Even, Odd line color 4
                                            Bus 10,11 - Even, Odd line color 5

PCU Interface
pcu_dwu_sel                    1      In    Block select from the PCU. When pcu_dwu_sel is high both
                                            pcu_adr and pcu_dataout are valid.
pcu_rwn                        1      In    Common read/not-write signal from the PCU.
pcu_adr[7:2]                   6      In    PCU address bus. Only 6 bits are required to decode the
                                            address space for this block.
pcu_dataout[31:0]              32     In    Shared write data bus from the PCU.
dwu_pcu_rdy                    1      Out   Ready signal to the PCU. When dwu_pcu_rdy is high it
                                            indicates the last cycle of the access. For a write cycle this
                                            means pcu_dataout has been registered by the block and
                                            for a read cycle this means the data on dwu_pcu_data is
                                            valid.
dwu_pcu_data[31:0]             32     Out   Read data bus to the PCU.

DIU Interface
dwu_diu_wreq                   1      Out   DWU requests DRAM write. A write request must be
                                            accompanied by a valid write address together with valid
                                            write data and a write valid.
dwu_diu_wadr[21:5]             17     Out   Write address to DIU. 17 bits wide (256-bit aligned word).
diu_dwu_wack                   1      In    Acknowledge from DIU that write request has been
                                            accepted and new write address can be placed on
                                            dwu_diu_wadr.
dwu_diu_data[63:0]             64     Out   Data from DWU to DIU. 256-bit word transfer over 4 cycles.
                                            First 64 bits is bits 63:0 of 256-bit word.
                                            Second 64 bits is bits 127:64 of 256-bit word.
                                            Third 64 bits is bits 191:128 of 256-bit word.
                                            Fourth 64 bits is bits 255:192 of 256-bit word.
dwu_diu_wvalid                 1      Out   Signal from DWU indicating that data on dwu_diu_data is
                                            valid.



30.7.2 DWU partition 
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Figure 221. DWU partition 



30.7.3 Configuration registers 

The configuration registers in the DWU are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for a description of the protocol and timing diagrams for reading and writing registers in the
DWU. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register
reads and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for
the DWU. When reading a register that is less than 32 bits wide, zeros should be returned on the upper
unused bit(s) of dwu_pcu_data. Table 155 lists the configuration registers in the DWU.



Table 155. DWU registers description

Address      Register name         #bits   Reset     Description

Control Registers
0x00         Reset                 1       0x1       Active low synchronous reset, self de-activating. A
                                                     write to this register will cause a DWU block reset.
0x04         Go                    1       0x0       Active high bit indicating the DWU is programmed
                                                     and ready to use. A low to high transition will cause
                                                     DWU block internal states to reset (configuration
                                                     registers are not reset).

Dot Line Store Configuration
0x08-0x38    ColorBaseAdr[11:0]    12x17   0x00000   Specifies the base address (in words) in memory
                                                     where data from a particular half color (N) will be
                                                     placed.
0x30-0x6C    ColorFifoSize[11:0]   12x8    0x00      Indicates the number of lines in the FIFO before
                                                     the line increment will wrap around in memory.
                                                     Bus 0,1 - Even, Odd line color 0
                                                     Bus 2,3 - Even, Odd line color 1
                                                     Bus 4,5 - Even, Odd line color 2
                                                     Bus 6,7 - Even, Odd line color 3
                                                     Bus 8,9 - Even, Odd line color 4
                                                     Bus 10,11 - Even, Odd line color 5
0x70         ColorLineSense        2       0x2       Specifies whether data written to DRAM for this
                                                     half color is increasing or decreasing sense.
                                                     0 - Decreasing sense
                                                     1 - Increasing sense
                                                     Bit 0 defines even color sense.
                                                     Bit 1 defines odd color sense.
0x74         ColorEnable           6       0x3F      Indicates whether a particular color is active or not.
                                                     When inactive no data is written to DRAM for that
                                                     color.
                                                     0 - Color off
                                                     1 - Color on
                                                     One bit per color, bit 0 is Color0 and so on.
0x78         MaxWriteAhead         8       0x00      Specifies the maximum number of lines that the
                                                     DWU can be ahead of the LLU.
0x7C         LineSize              16      0x0000    Indicates the number of dots per line.

Working Registers
0x80         LineDotCnt            16      0x0000    Indicates the number of remaining dots in the
                                                     current line. (Read Only)
0x84         FifoFillLevel         8       0x00      Number of lines in the FIFO, written to but not
                                                     read. (Read Only)



A low to high transition of the Go register causes the internal states of the DWU to be reset. All
configuration registers will remain the same. The block indicates the transition to other blocks via the dwu_go_pulse
signal.



The ColorLineInc bus specifies the number of addresses (in 256-bit words) between successive half lines
in the dot line store. It is derived from the LineSize register by rounding up to the nearest 256-bit value. The
same value is used for all half colors.

if (line_size[7:0] != 0) then
    color_line_inc[7:0] = line_size[15:8] + 1
else
    color_line_inc[7:0] = line_size[15:8]



30.7.4 FIFO fill level



The DWU keeps a running total of the number of lines in the dot store FIFO. Each time the DWU writes a
line to DRAM (determined by the DIU interface subblock and signalled via line_wr) it increments the
fill level and signals the line increment to the LLU (pulse on dwu_llu_line_wr). Conversely, if it receives an
active llu_dwu_line_rd pulse from the LLU, the fill level is decremented. If the fill level increases to the
programmed max level (max_write_ahead) then the DWU stalls and indicates this back to the DNC by
de-asserting the dwu_dnc_ready signal.

If one or more of the DIU buffers fill, the DIU interface signals the fill level logic via the buf_full signal,
which in turn causes the DWU to de-assert the dwu_dnc_ready signal to stall the DNC. The buf_full signal
will remain active until the DIU services a pending request from the full buffer, reducing the buffer
level.

The DWU does not increment the fill level until a complete line of dot data is in DRAM, not just a
complete line received from the DNC. This ensures that the LLU cannot start reading a partial line from
DRAM before the DWU has finished writing the line.

The fill level is reset to zero each time a new page is started, on receiving a pulse via the dwu_go_pulse
signal.

The line FIFO fill level can be read by the CPU via the PCU at any time by accessing the FifoFillLevel
register.
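The fill level behaviour described in this section can be modelled in a few lines. The class and method names below are illustrative and not part of the design, and the guard against decrementing below zero is an assumption rather than something the text states.

```python
class FifoFillLevel:
    """Behavioural model of the line FIFO fill level (names are illustrative)."""

    def __init__(self, max_write_ahead: int) -> None:
        self.max_write_ahead = max_write_ahead
        self.level = 0

    def go_pulse(self) -> None:
        """New page: the fill level is reset to zero."""
        self.level = 0

    def line_written(self) -> None:
        """A complete line has reached DRAM (line_wr)."""
        self.level += 1

    def line_read(self) -> None:
        """The LLU has finished reading a line (llu_dwu_line_rd)."""
        if self.level > 0:  # assumed guard, not stated in the text
            self.level -= 1

    def ready(self, buf_full: bool = False) -> bool:
        """Value of dwu_dnc_ready: low when the FIFO or a DIU buffer is full."""
        return self.level < self.max_write_ahead and not buf_full


if __name__ == "__main__":
    fifo = FifoFillLevel(max_write_ahead=2)
    fifo.line_written()
    fifo.line_written()
    print(fifo.ready())   # False: the DNC is stalled
    fifo.line_read()
    print(fifo.ready())   # True: the DNC may send data again
```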
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30.7.5 Buffer address generator 



[Figure: buffer address generator sub-block, containing the up count generator, down count generator, dot counter, even bit-write decode and odd bit-write decode. Signals shown include dwu_go_pulse, dnc_dwu_avail, dwu_dnc_ready, line_size, line_fin, color_line_sense, dnc_dwu_data, data_active, wr_en, wr_bit[63:0], wr_adr[3:0] and wr_dot_data.]

Figure 222. Buffer address generator sub-block

30.7.5.1 Buffer address generator description 

The buffer address generator subblock is responsible for accepting data from the DNC and writing it to the
DIU buffers in the correct order.

The buffer address and active bit-write for a particular dot data write are calculated by the buffer address
generator based on the dot count of the current line, the programmed sense of the color and the line size.

All configuration registers should be programmed while the Go bit is set to zero. Once complete, the block
can be enabled by setting the Go bit to one. The transition from zero to one will cause the internal states to
reset.

If the color_line_sense signal for a color is one (i.e. increasing) then the bit-write generation is
straightforward, as dot data is aligned with a 256-bit boundary. So for the first dot in that color, bit 0 of the
wr_bit bus will be active (in buffer word 0), for the second dot bit 1 is active and so on to the 256th dot
where bit 63 is active (in buffer word 3). This is repeated for all 256-bit words until the final word, where
only a partial number of bits are written before the word is transferred to DRAM.

If the color_line_sense signal for a color is zero (i.e. decreasing) the bit-write generation for that color is
adjusted by an offset calculated from the pre-programmed line length (line_size). The offset adjusts the bit
write to allow the line to finish on a 256-bit boundary. For example, if the line length is 400, for the first
dot received bit 7 of wr_bit is active in buffer word 3 (the line length is halved because of odd/even lines of
color), for the second dot bit 6 is active (buffer word 3), and so on to the 200th dot of data with bit 0 of
wr_bit active (buffer word 0).
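The two examples above can be reproduced with a small position calculation. This is a sketch under stated assumptions: the function name is hypothetical, dots are counted from 0 within one half color, and the wrap of the buffer address within the physical DIU buffer is ignored.

```python
def bit_write(dot: int, half_line_dots: int, increasing: bool) -> tuple[int, int]:
    """Return (buffer_word, bit) for the dot-th dot (0-based) of one half color.

    Buffer words are 64 bits wide. For decreasing sense the position is
    offset so that the final dot of the half line lands on bit 0 of word 0.
    """
    pos = dot if increasing else half_line_dots - 1 - dot
    return pos // 64, pos % 64


if __name__ == "__main__":
    # Increasing sense: first dot and 256th dot of the half color
    print(bit_write(0, 256, increasing=True))     # (0, 0)
    print(bit_write(255, 256, increasing=True))   # (3, 63)
    # Decreasing sense, line length 400 (200 dots per half line)
    print(bit_write(0, 200, increasing=False))    # (3, 7)
    print(bit_write(199, 200, increasing=False))  # (0, 0)
```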
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30.7.5.2 Bit-write decode 



The buffer address generator contains 2 instances of the bit-write decode, one configured for odd dot data
and the other for even. The counter (either up or down counter) used to generate the addresses is selected by
the color_line_sense signal. Each block determines if it is active on this cycle by comparing its configured
type with the current dot count address and the data_active signal.

The wr_bit bus is a direct decoding of the lower 6 count bits (count[6:1]), and the DIU buffer address is
the remaining higher bits of the counter (count[10:7]).

The signal generation is given as follows:

// determine the counter to use
if (color_line_sense == 1)
    count = up_cnt[10:0]
else
    count = dn_cnt[10:0]
// determine if active, based on instance type
wr_en = data_active & (count[0] == odd_even_type)   // odd = 1, even = 0
// determine the bit write value
wr_bit[63:0] = decode(count[6:1])
// determine the buffer 64-bit address
wr_adr[3:0] = count[10:7]

30.7.5.3 Up counter generator 

The up counter increments for each new dot and is used to determine the write position of the dot in the
DIU buffers for increasing sense data. At the end of each line of dot data (as indicated by line_fin), the
counter is rounded up to the nearest 256-bit word boundary. This causes the DIU buffers to be flushed to
DRAM including any partially filled 256-bit words. The counter is reset to zero if the dwu_go_pulse is
one.

// Up-Counter Logic
if (dwu_go_pulse == 1) then
    up_cnt[10:0] = 0
elsif (line_fin == 1) then
    // round up
    if (up_cnt[8:1] != 0)
        up_cnt[10:9]++
    else
        up_cnt[10:9]
    // bit-selector
    up_cnt[7:0] = 0
elsif ((dnc_dwu_avail == 1) AND (dwu_dnc_ready == 1)) then
    up_cnt[7:0]++



30.7.5.4 Down counter generator 

The down counter logic decrements for each new dot and is used to determine the write position of the dot
in the DIU buffers for decreasing sense data. When the dwu_go_pulse bit is one the lower bits (i.e. 8 to 0)
of the counter are reset to the line size value (line_size), and the higher bits to zero. The bits used to determine
the bit-write values and 64-bit word addresses in the DIU buffers begin at line size and count down to zero.
The remaining higher bits, which are used to determine the DIU buffer 256-bit address and buffer fill level,
begin at zero and count up. The counter is active when valid dot data is present, i.e. dnc_dwu_avail equals 1.

When the end of line is detected (line_fin equals 1) the counter is rounded to the next 256-bit word, and the
lower bits are reset to the line size value.

// Down-Counter Logic
if (dwu_go_pulse == 1) then
    dn_cnt[8:0] = line_size[8:0]
    dn_cnt[10:9] = 0
elsif (line_fin == 1) then
    // perform rounding up
    if (dn_cnt[8:1] != 0)
        dn_cnt[10:9]++
    else
        dn_cnt[10:9]
    // bit-select is reset
    dn_cnt[8:0] = line_size[8:0]   // bit select bits
elsif ((dnc_dwu_avail == 1) AND (dwu_dnc_ready == 1)) then
    dn_cnt[8:0]--
    dn_cnt[10:9]++



30.7.5.6 Dot counter 

The dot counter simply counts each active dot received from the DNC. It sets the counter to line_size and
decrements each time a valid dot is received. When the count equals zero the line_fin signal is pulsed and
the counter is reset to line_size.

The counter is reset to line_size when dwu_go_pulse is 1.
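A behavioural sketch of the dot counter is given below. The class and method names are illustrative, and the exact cycle on which line_fin is pulsed relative to the final dot is an assumption of this model.

```python
class DotCounter:
    """Counts down the dots of a line and reports line_fin (illustrative model)."""

    def __init__(self, line_size: int) -> None:
        self.line_size = line_size
        self.count = line_size

    def go_pulse(self) -> None:
        """dwu_go_pulse: reload the counter with line_size."""
        self.count = self.line_size

    def dot(self) -> bool:
        """Register one valid dot; return True when line_fin is pulsed."""
        self.count -= 1
        if self.count == 0:
            self.count = self.line_size
            return True
        return False


if __name__ == "__main__":
    counter = DotCounter(line_size=4)
    print([counter.dot() for _ in range(8)])
    # [False, False, False, True, False, False, False, True]
```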
30.7.6 DIU buffer 

The DIU buffer is a 64 bit x 8 word dual port register array with bit write capability. The buffer could be
implemented with flip-flops should it prove more efficient.
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30.7.7 DIU interface 
[Figure: DIU interface sub-block, showing among others the line_wr signal and the external dwu_diu_data bus.]

Figure 223. DIU interface sub-block



30.7.7.1 DIU interface general description

The DIU interface determines when a buffer needs a data word to be transferred to DRAM. It generates the
DRAM address based on the dot line position, the color base address and the other programmed
parameters. A write request is made to DRAM and when acknowledged a 256-bit data word is transferred. The
interface determines if further words need to be transferred and repeats the transfer process.

If the FIFO in DRAM has reached its maximum level, or one of the buffers has temporarily filled, the
DWU will stall data generation from the DNC.

A similar process is repeated for each line until the end of page is reached. At the end of a page the CPU is
required to reset the internal state of the block before the next page can be printed. A low to high transition
of the Go register will cause the internal block reset, which causes all registers in the block to reset with
the exception of the configuration registers. The transition is indicated to subblocks by a pulse on the
dwu_go_pulse signal.
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30.7.7.2 Interface controller 



[Figure: interface controller state diagram with states Idle, ColorSelect, Request, Data0, Data1, Data2 and Data3. Idle is entered on reset or when dwu_go_pulse = 1. dwu_diu_wreq = 1 in the Request state, and buf_rd_en = 1 during the data transfer states. The machine remains in the same state by default, and all outputs are zero unless otherwise stated.]

State description:

Idle: idle state, wait for active request
ColorSelect: select the color to update before requesting to DIU
Request: request issued, wait for acknowledge
Data0: data word 0 transfer
Data1: data word 1 transfer
Data2: data word 2 transfer
Data3: data word 3 transfer

Figure 224. Interface controller state diagram



The interface controller state machine waits in the Idle state until an active request is indicated by the read
pointer (via the req_active signal). When an active request is received the machine proceeds to the
ColorSelect state to determine which buffers need a data transfer. In the ColorSelect state it cycles through
each color and determines if the color is enabled (and consequently the buffer needs servicing). If enabled
it jumps to the Request state, otherwise the color_cnt is incremented and the next color is checked.

In the Request state the machine issues a write request to the DIU and waits in the Request state until the
write request is acknowledged by the DIU (diu_dwu_wack). Once an acknowledge is received the state
machine clocks through 4 cycles, transferring a 64-bit data word each cycle and incrementing the
corresponding buffer read address. After transferring the data to the DIU the machine returns to the ColorSelect
state to determine if further buffers need servicing. On the transition the controller indicates to the address
generator (adr_update) to update the address for that selected color.

If all colors are transferred (color_cnt equal to 6) the state machine returns to Idle, updating the last word
flags (group_fin) and request logic (req_update).

The dwu_diu_wvalid signal is a delayed version of the buf_rd_en signal to allow for pipeline delays
between data leaving the buffer and being clocked through to the DIU block.

The state machine will return from any state to Idle if the reset or the dwu_go_pulse is 1.
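The state sequence described above can be sketched as a cycle-stepped model. This is an illustration only: the class, the dictionary of outputs and the exact cycle on which each output is asserted are assumptions, as is the reset of color_cnt on leaving Idle. Reset and dwu_go_pulse handling are omitted for brevity.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    COLOR_SELECT = auto()
    REQUEST = auto()
    DATA0 = auto()
    DATA1 = auto()
    DATA2 = auto()
    DATA3 = auto()


_NEXT_DATA = {State.DATA0: State.DATA1, State.DATA1: State.DATA2,
              State.DATA2: State.DATA3}


class InterfaceController:
    def __init__(self, color_enable: int) -> None:
        self.color_enable = color_enable  # 6-bit mask, bit 0 is color 0
        self.state = State.IDLE
        self.color_cnt = 0

    def step(self, req_active: bool, wack: bool) -> dict:
        """Advance one clock; return the outputs asserted during this cycle."""
        out = {"wreq": False, "buf_rd_en": False, "adr_update": False,
               "group_fin": False, "req_update": False}
        if self.state is State.IDLE:
            if req_active:
                self.color_cnt = 0  # assumed: restart the color scan
                self.state = State.COLOR_SELECT
        elif self.state is State.COLOR_SELECT:
            if self.color_cnt == 6:  # all colors checked
                out["group_fin"] = out["req_update"] = True
                self.state = State.IDLE
            elif (self.color_enable >> self.color_cnt) & 1:
                self.state = State.REQUEST
            else:
                self.color_cnt += 1  # skip a disabled color
        elif self.state is State.REQUEST:
            out["wreq"] = True
            if wack:
                self.state = State.DATA0
        else:  # DATA0..DATA3: one 64-bit word per cycle
            out["buf_rd_en"] = True
            if self.state is State.DATA3:
                out["adr_update"] = True
                self.color_cnt += 1
                self.state = State.COLOR_SELECT
            else:
                self.state = _NEXT_DATA[self.state]
        return out
```

With colors 0 and 2 enabled (color_enable = 0b000101) and an immediate acknowledge, stepping the model from Idle back to Idle transfers eight 64-bit words, raises adr_update twice and raises group_fin once.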
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30.7.7.3 Address generator 

The address generator block maintains 12 pointers (color_adr[11:0]) to DRAM corresponding to the current
write address in the dot line store for each half color. When a DRAM transfer occurs the address pointer is
used first and then updated for the next transfer for that color. The pointer used is selected by the req_sel
bus, and the pointer update is initiated by the adr_update signal from the interface controller.

The pointer update is dependent on the sense of the color of that pointer, the pointer position in a line and
the line position in the FIFO. The programming of the color_base_adr needs to be adjusted depending on
the sense of the colors. For increasing sense colors the color_base_adr specifies the address of the first
word of the first line of the FIFO, whereas for decreasing sense colors the color_base_adr specifies the address
of the last word of the first line of the FIFO.

For increasing sense colors, the initialization value (i.e. when dwu_go_pulse is 1) is the color_base_adr. For
each word that is written to DRAM the pointer is incremented. If the word is the last word in a line (as
indicated by last_wd from the read pointers) the pointer is also incremented. If the word is the last word in
a line, and the line is the last line in the FIFO (indicated by fifo_end from the line counter), the pointer is
reset to color_base_adr.

In the case of decreasing sense colors, the initialization value (i.e. when dwu_go_pulse is 1) is the
color_base_adr. For each line of decreasing sense color data the pointer starts at the line end and
decrements to the line start. For each word that is written to DRAM the pointer is decremented. If the word is
the last word in a line the pointer is incremented by color_line_inc * 2 + 1: one line length to account for
the line of data just written, and another line length for the next line to be written. If the word is the last
word in a line, and the line is the last line in the FIFO, the pointer is reset to the initialization value (i.e.
color_base_adr).

The address is calculated as follows:

if (dwu_go_pulse == 1) then
    color_adr[11:0] = color_base_adr[11:0][21:5]
elsif (adr_update == 1) then {
    // determine the color
    color = req_sel[3:0]
    if ((fifo_end[color] == 1) AND (last_wd == 1)) then {
        // line end and fifo wrap
        color_adr[color] = color_base_adr[color][21:5]
    }
    elsif (last_wd == 1) then {
        // just a line end, no fifo wrap
        if (color_line_sense[color % 2] == 1) then   // increasing sense
            color_adr[color]++
        else                                          // decreasing sense
            color_adr[color] = color_adr[color] + (color_line_inc * 2) + 1
    }
    else {
        // regular word write
        if (color_line_sense[color % 2] == 1) then   // increasing sense
            color_adr[color]++
        else                                          // decreasing sense
            color_adr[color]--
    }
}
// select the correct address for this transfer
dwu_diu_wadr = color_adr[req_sel]
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30.7.7.4 Line count 

The line counter logic counts the number of dot data lines stored in DRAM for each color. A separate
pointer is maintained for each color. A line pointer is updated each time the final word of a line is
transferred to DRAM. This is determined by a combination of the adr_update and last_wd signals. The pointer to
update is indicated by the req_sel bus.

When an update occurs to a pointer it is compared to zero. If it is non-zero the count is decremented,
otherwise the counter is reset to color_fifo_size. If a counter is zero the fifo_end signal is set high to indicate
to the address generator block that the line is the last line of this color's FIFO.

If the dwu_go_pulse signal is one the counters are reset to color_fifo_size.

if (dwu_go_pulse == 1) then
    line_cnt[11:0] = color_fifo_size[11:0]
elsif ((adr_update == 1) AND (last_wd == 1)) then {
    // determine the pointer to operate on
    color = req_sel[3:0]
    // update the pointer
    if (line_cnt[color] == 0) then
        line_cnt[color] = color_fifo_size[color]
    else
        line_cnt[color]--
}
// count is zero, it is the last line of the fifo
for (i = 0; i < 12; i++) {
    fifo_end[i] = (line_cnt[i] == 0)
}
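The line count logic above can be transliterated into a small model to see when fifo_end is raised. The class and method names are illustrative, and the model is a literal reading of the pseudocode rather than a statement about the final hardware.

```python
class LineCounter:
    """Per half color line counter, transliterated from the pseudocode above."""

    def __init__(self, color_fifo_size: list[int]) -> None:
        self.size = list(color_fifo_size)
        self.cnt = list(color_fifo_size)  # state after dwu_go_pulse

    def fifo_end(self, color: int) -> bool:
        """High when the next line written is the last line of this FIFO."""
        return self.cnt[color] == 0

    def update(self, color: int) -> None:
        """Call when adr_update and last_wd are both 1 for this half color."""
        if self.cnt[color] == 0:
            self.cnt[color] = self.size[color]
        else:
            self.cnt[color] -= 1


if __name__ == "__main__":
    counter = LineCounter([2] * 12)
    trace = []
    for _ in range(6):
        trace.append(counter.fifo_end(0))
        counter.update(0)
    print(trace)  # [False, False, True, False, False, True]
```

As transliterated, fifo_end for a half color is high once in every color_fifo_size + 1 line updates.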

30.7.7.5 Read Pointer 

The read pointer logic maintains the buffer read address pointers. The read pointer is used to determine
which 64-bit words to read from the buffer for transfer to DRAM.

The read pointer logic compares the read and write pointers of each DIU buffer to determine which buffers
require data to be transferred to DRAM (pend[11:0] bus), and which buffers are full (the buf_full signal).
Only enabled buffers are considered, as indicated by the color_enable bus.

Buffers are grouped into odd and even buffer groups. If an odd buffer requires DRAM access the
odd_pend signal will be active; if an even buffer requires DRAM access the even_pend signal will be
active. If both odd and even buffers require DRAM access, the even buffers will get serviced first.

If any buffer requires a DRAM transfer, the logic will indicate this to the interface controller via the req_active
signal, with the odd_even_sel signal determining which group of buffers gets serviced. The interface
controller will check the color_enable signal and issue DRAM transfers for all enabled colors in a group.
When the transfers are complete it tells the read pointer logic to update the requests pending via the
req_update signal.

The req_sel[3:0] signal tells the address generator which buffer is being serviced. It is constructed from
the odd_even_sel signal and the color_cnt[2:0] bus from the interface controller. When data is being
transferred to DRAM the word pointer and read pointer for the corresponding buffer are updated. The req_sel
determines which pointer should be incremented.
// determine which buffers need updates
for (i = 0; i < 12; i++) {
    // determine if request is active, filtered by color enable
    if (wr_adr[i][3:2] != rd_adr[i][3:2])
        pend[i] = color_enable[i / 2]
    else
        pend[i] = 0
    // determine if any enabled buffer is full
    if (((wr_adr[i][3:0] - rd_adr[i][3:0]) > 7) AND (color_enable[i / 2] == 1)) then
        buf_full = 1
}
// odd half colors (1,3,5,7,9,11), even half colors (0,2,4,6,8,10)
odd_pend  = (pend[1] | pend[3] | pend[5] | pend[7] | pend[9] | pend[11])
even_pend = (pend[0] | pend[2] | pend[4] | pend[6] | pend[8] | pend[10])
// fixed servicing order, only update when controller dictates so
if (req_update == 1) then {
    if (even_pend == 1) then       // even always first
        odd_even_sel = 0
        req_active = 1
    elsif (odd_pend == 1) then     // then check odd
        odd_even_sel = 1
        req_active = 1
    else                           // nothing active
        odd_even_sel = 0
        req_active = 0
}
// selected requestor
req_sel[3:0] = {color_cnt[2:0], odd_even_sel}   // concatenation

The read address pointer logic consists of 12 2-bit counters and a word select pointer. The pointers are
reset when dwu_go_pulse is one. The word pointer (word_ptr) is common to all buffers and is used to read
out the 64-bit words from the DIU buffer. It is incremented when buf_rd_en is active. If the word_ptr is 3
and buf_rd_en is active the selected read pointer (rd_ptr[req_sel]) will be incremented. A concatenation
of the read pointer and the word pointer is used to construct the buffer read address. The read pointers
are not reset at the end of each line.

// determine which pointer to update
if (dwu_go_pulse == 1) then
    rd_ptr[11:0] = 0
    word_ptr = 0
elsif (buf_rd_en == 1) then {
    word_ptr++
    if (word_ptr == 3) then
        rd_ptr[req_sel]++
}
// create the address from the pointer, and word reader
rd_adr[req_sel] = {rd_ptr[req_sel], word_ptr}   // concatenation

The read pointer block determines if the word being read from the DIU buffers is the last word of a line.
The buffer address generator indicates that the last dot is being written into the buffers via the line_fin
signal. When it is received the logic marks the 256-bit word in the buffers as the last word. When the last
word is read from the DIU buffer and transferred to DRAM, the flag for that word is reflected to the address
generator.

// line end set the flags
if (dwu_go_pulse == 1) then
    last_flag[1:0][1:0] = 0
elsif (line_fin == 1) then
    // determines the current 256-bit word even been written to
    last_flag[0][wr_adr[0][2]] = 1      // even group flag
    // determines the current 256-bit word odd been written to
    last_flag[1][wr_adr[1][2]] = 1      // odd group flag
// last word reflection to address generator
last_wd = last_flag[odd_even_sel][rd_ptr[req_sel][0]]
// clear the flag
if (group_fin == 1) then
    last_flag[odd_even_sel][rd_ptr[req_sel][0]] = 0

When a complete line has been written into the DIU buffers (but has not yet been transferred to DRAM),
the buffer address generator block will pulse the line_fin signal. The DWU must wait until all enabled
buffers are transferred to DRAM before signaling the LLU that a complete line is available in the dot line
store (dwu_llu_line_wr signal). When the line_fin is received all buffers will require transfer to DRAM.
Due to the arbitration, the even group will get serviced first, then the odd. As a result the line finish
pulse to the LLU is generated from the last_flag of the odd group.
// must be odd, odd group transfer complete and the last word
dwu_llu_line_wr = odd_even_sel AND group_fin AND last_wd

31 Line Loader Unit (LLU)

31.1 Overview

The Line Loader Unit (LLU) reads dot data from the line buffers in DRAM and structures the data into
even and odd dot channels destined for the same print time. The blocks of dot data are transferred to the
PHI and then to the printhead. Figure 225 shows a high level data flow diagram of the LLU in context.

[Figure: dot data flows from the DWU to DRAM (via the DIU), from DRAM to the LLU, and from the LLU to the PHI.]

Figure 225. High level data flow diagram of LLU in context



31.2 Physical requirement imposed by the printhead

The DWU re-orders dot data into 12 separate dot data line FIFOs in the DRAM. Each FIFO corresponds to
6 colors of odd and even data. The LLU reads the dot data line FIFOs and sends the data to the printhead
interface. The LLU decides when data should be read from the dot data line FIFOs to correspond with the
time that the particular nozzle on the printhead is passing the current line. The interaction of the DWU and
LLU with the dot line FIFOs compensates for the physical spread of nozzles firing over several lines at
once. For further explanation see Section 30 Dotline Writer Unit (DWU) and Section 32 PrintHead Interface
(PHI). Figure 226 shows the physical relationship of nozzle rows and the line time the LLU starts
reading from the dot line store.

Figure 226. Paper and printhead nozzles relationship (example with D1=D2=5)

Within each line of dot data the LLU is required to generate an even and odd dot data stream to the PHI
block. Figure 227 shows the even and odd dot streams as they would map to an example bi-lithic printhead.
The PHI block determines which stream should be directed to which printhead IC.

Figure 227. Printhead structure and dot generate order

31.3 Dot generate and transmit order

The structure of the printhead ICs dictates the dot transmit order to each printhead IC. The LLU reads data
from the dot line FIFO and generates an even and odd dot stream, which is then re-ordered (in the PHI) into
the transmit order for transfer to the printhead.

The DWU separates dot data into even and odd half lines for each color and stores them in DRAM. It can
store odd or even dot data in increasing or decreasing order in DRAM. The order is programmable, but for
descriptive purposes assume even in increasing order and odd in decreasing order. The dot order structure
in DRAM is shown in Figure 219.

The LLU contains 2 dot generator units. Each dot generator reads dot data from DRAM and generates a
stream of odd or even dots. The dot order may be increasing or decreasing depending on how the DWU
was programmed to write data to DRAM. An example of the even and odd dot data streams to DRAM is
shown in Figure 228. In the example the odd dot generator is configured to produce odd dot data in
decreasing order and the even dot generator produces dot data in increasing order.

The PHI block accepts the even and odd dot data streams and reconstructs the streams into transmit order
to the printhead.

The LLU line size refers to the page width in dots and not necessarily the printhead width. The page width
is often the dot margin number of dots less than the printhead width. They can be the same size for full
bleed printing.

Figure 228. Dot data generated and transmitted order

31.4 LLU start-up



At the start of a page the LLU must wait for the dot line store in DRAM to fill to a configured level (given
by FifoReadThreshold) before starting to read dot data. Once the LLU starts processing dot data for a page
it must continue until the end of the page. The DWU (and other PEP blocks in the pipeline) must ensure
there is always data in the dot line store for the LLU to read, otherwise the LLU will stall, causing the PHI
to stall and potentially generate a print error. The FifoReadThreshold should be chosen to allow for data
rate mismatches between the DWU write side and the LLU read side of the dot line FIFO. The LLU will not
generate any dot data until the FifoReadThreshold level in the dot line FIFO is reached.

Once the FifoReadThreshold is reached the LLU begins page processing; the FifoReadThreshold is
ignored from then on.

When the LLU begins page processing it produces dot data for all colors (although some dot data color
may be null data). The LLU compares the line count of the current page with the ColorRelLine value
configured for each color; when the line count reaches that value the LLU will start reading from that
color's FIFO in DRAM. For colors that have not yet reached their ColorRelLine value the LLU will generate
null data (zero data) and not read from DRAM for that color. ColorRelLine[N] specifies the number of lines
separating the Nth half color and the first half color to print on that page.

For the example printhead shown in Figure 226, color 0 odd will start at line 0 and the remaining colors
will all have null data. Color 0 odd will continue with real data until line 5, when color 0 odd and even
will contain real data and the remaining colors will contain null data. At line 10, color 0 odd and even and
color 1 odd will contain real data, with the remaining colors containing null data. Every 5 lines a new half
color will contain real data and the remaining half colors null data until line 55, when all colors will
contain real data. In the example ColorRelLine[0]=5, ColorRelLine[1]=0, ColorRelLine[2]=15,
ColorRelLine[3]=10, etc.

It is possible to turn off any one of the color planes of data (via the ColorEnable register). In such cases
the LLU will generate zeroed dot data information to the PHI as normal but will not read data from the
DRAM.
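
The start-up schedule described above can be sketched as a small behavioural model. This is only an illustration, not the LLU implementation: the function names are hypothetical, the ColorRelLine entries beyond index 3 are an assumed continuation of the "etc." pattern in the example, and the comparison simply treats a half color as live once the line count has reached its ColorRelLine value.

```python
# Hypothetical behavioural model of LLU start-up; all names are illustrative.
NUM_HALF_COLORS = 12


def example_color_rel_line():
    """ColorRelLine values for the Figure 226 example.

    Indices 0..3 come from the text (5, 0, 15, 10); the remaining entries
    extend the same 5-line spacing and are an assumption.
    """
    rel = [0] * NUM_HALF_COLORS
    for color in range(6):
        rel[2 * color] = 10 * color + 5      # even half color
        rel[2 * color + 1] = 10 * color      # odd half color
    return rel


def half_colors_with_real_data(line, color_rel_line, color_enable=0x3F):
    """Return the half-color indices read from DRAM at this line.

    Every other half color is fed null (zero) data. A color disabled in
    color_enable (one bit per color, bit 0 = color 0) is never read.
    """
    live = []
    for i in range(NUM_HALF_COLORS):
        enabled = (color_enable >> (i // 2)) & 1
        if enabled and line >= color_rel_line[i]:
            live.append(i)
    return live
```

With the example values, only half color 1 (color 0 odd) is live at line 0, and all twelve half colors are live from line 55 onwards.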



31.4.1 LLU bandwidth requirements 



The LLU is required to generate data for feeding to the printhead interface. The rate required is dependent
on the printhead construction and on the line rate configured. The maximum data rate the LLU can produce
is 12 bits of dot data per cycle, but the PHI consumes at 12 bits per phiclk cycle (2/3 pclk rate), i.e. 8
bits per pclk cycle. Therefore the DRAM bandwidth requirement for a double buffered LLU is 8 bits per
cycle on average. If 1.5 buffering is used then the peak bandwidth requirement is doubled to 16 bits per
cycle but the average remains at 8 bits per cycle. Note that while the LLU and PHI could produce data at
the 8 bits per cycle rate, the DWU can only produce data at 6 bits per cycle rate.
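
The bandwidth figures above follow from simple arithmetic, sketched here. The constant and function names are hypothetical; the peak factors (1 for double buffering, 2 for 1.5 buffering) are taken directly from the paragraph rather than derived.

```python
from fractions import Fraction

# Figures quoted in the text; the names are illustrative only.
PHI_BITS_PER_PHICLK = 12             # PHI consumes 12 bits per phiclk cycle
PHICLK_PER_PCLK = Fraction(2, 3)     # phiclk runs at 2/3 of the pclk rate


def average_bits_per_pclk():
    """Average DRAM read bandwidth the LLU needs, in bits per pclk cycle."""
    return PHI_BITS_PER_PHICLK * PHICLK_PER_PCLK


def peak_bits_per_pclk(buffering):
    """Peak bandwidth for 'double' or '1.5' buffering, as stated in the text."""
    peak_factor = {"double": 1, "1.5": 2}[buffering]
    return average_bits_per_pclk() * peak_factor
```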



Doc: SoPEC_hardware_design 
Version: 2.3 



S3 Proprietary Document 



^ Nov 2002 
Page 487 



SoPEC : Hardware Design 



31.5 Implementation

31.5.1 LLU partition

[Figure: the LLU comprises the DIU interface (llu_diu_radr, diu_data, diu_llu_rvalid, diu_llu_rack), two sets of six DIU buffers (8 words x 64 bits each, with wr_data, wr_en, wr_adr, rd_data, rd_adr and buf_emp signals), the configuration registers (llu_go_pulse), the FIFO fill level block (llu_en), and the even and odd dot generators.]

Figure 229. LLU partition

31.5.2 Definitions of I/O

Table 156. LLU I/O definition

Port name | Pins | I/O | Description

Clocks and Resets
pclk | 1 | In | System clock
prst_n | 1 | In | System reset, synchronous active low

PHI Interface
llu_phi_data[1:0][5:0] | 2x6 | Out | Dot data from LLU to the PHI, each bit is a color plane 5 downto 0. Bus 0 - Even dot data stream. Bus 1 - Odd dot data stream. Data is active when the corresponding bit is active in the llu_phi_avail bus.
phi_llu_ready[1:0] | 2 | In | Indicates that PHI is ready to accept data from the LLU. 0 - Even dot data stream. 1 - Odd dot data stream.
llu_phi_avail[1:0] | 2 | Out | Indicates valid data present on corresponding llu_phi_data. 0 - Even dot data stream. 1 - Odd dot data stream.

DIU Interface
llu_diu_rreq | 1 | Out | LLU requests DRAM read. A read request must be accompanied by a valid read address.
llu_diu_radr[21:5] | 17 | Out | Read address to DIU. 17 bits wide (256-bit aligned word).
diu_llu_rack | 1 | In | Acknowledge from DIU that read request has been accepted and new read address can be placed on llu_diu_radr.
diu_data[63:0] | 64 | In | Data from DIU to LLU. Each access is 256 bits received over 4 clock cycles. First 64 bits is bits 63:0 of 256 bit word. Second 64 bits is bits 127:64 of 256 bit word. Third 64 bits is bits 191:128 of 256 bit word. Fourth 64 bits is bits 255:192 of 256 bit word.
diu_llu_rvalid | 1 | In | Signal from DIU telling LLU that valid read data is on the diu_data bus.

DWU Interface
dwu_llu_line_wr | 1 | In | DWU line write. Indicates that the DWU has completed a full line write. Active high.
llu_dwu_line_rd | 1 | Out | LLU line read. Indicates that the LLU has completed a line read. Active high.
dwu_llu_cfifosize[11:0][7:0] | 12x8 | In | Indicates the number of lines in the FIFO before the line increment will wrap around in memory.

PCU Interface
pcu_llu_sel | 1 | In | Block select from the PCU. When pcu_llu_sel is high both pcu_adr and pcu_dataout are valid.
pcu_rwn | 1 | In | Common read/not-write signal from the PCU.
pcu_adr[7:2] | 6 | In | PCU address bus. Only 6 bits are required to decode the address space for this block.
pcu_dataout[31:0] | 32 | In | Shared write data bus from the PCU.
llu_pcu_rdy | 1 | Out | Ready signal to the PCU. When llu_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on llu_pcu_data is valid.
llu_pcu_data[31:0] | 32 | Out | Read data bus to the PCU.



31.5.3 Configuration registers 

The configuration registers in the LLU are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for a description of the protocol and timing diagrams for reading and writing registers in the
LLU. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads
and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the
LLU. When reading a register that is less than 32 bits wide zeros should be returned on the upper unused
bit(s) of llu_pcu_data. Table 157 lists the configuration registers in the LLU.



Table 157. LLU registers description

Address | Register name | #bits | Value on reset | Description

Control Registers
0x00 | Reset | 1 | 0x1 | Active low synchronous reset, self de-activating. A write to this register will cause an LLU block reset.
0x04 | Go | 1 | 0x0 | Active high bit indicating the LLU is programmed and ready to use. A low to high transition will cause LLU block internal states to reset.

Configuration
0x08 - 0x38 | ColorBaseAdr[11:0] | 12x17 | 0x00000 | Specifies the base address (in words) in memory where data from a particular half color (N) will be placed.
0x3C | ColorEnable | 6 | 0x3F | Indicates whether a particular color is active or not. When inactive no data is written to DRAM for that color. 0 - Color off. 1 - Color on. One bit per color, bit 0 is Color 0 and so on.
0x40 | LineSize | 16 | 0x0000 | Indicates the number of dots per line.
0x44 | FifoReadThreshold | 8 | 0x00 | Specifies the number of lines that should be in the FIFO before the LLU starts reading.
0x48 - 0x78 | ColorRelLine[11:0] | 12x8 | 0x00 | Specifies the relative number of lines to wait from the first before starting to read dot data from the corresponding dot data FIFO. Bus 0,1 - Even, Odd line color 0. Bus 2,3 - Even, Odd line color 1. Bus 4,5 - Even, Odd line color 2. Bus 6,7 - Even, Odd line color 3. Bus 8,9 - Even, Odd line color 4. Bus 10,11 - Even, Odd line color 5.

Working Registers
0x7C | FifoFillLevel | 8 | 0x00 | Number of lines in the dot line FIFO, lines written in but not read out (Read Only).

A low to high transition of the Go register causes the internal states of the LLU to be reset. All
configuration registers will remain the same. The block indicates the transition to other blocks via the
llu_go_pulse signal.

The ColorLineInc bus specifies the number of addresses (in 256-bit words) between successive half lines
in the dot line store, and is used to determine when a half line of data has been read from DRAM. It is
derived from the LineSize register by rounding up to the nearest 256-bit value. The same value is used for
all half colors.
if (line_size[7:0] != 0) then
    color_line_inc[7:0] = line_size[15:8] + 1
else
    color_line_inc[7:0] = line_size[15:8]
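
A direct transcription of the rounding rule above, offered as a sketch. The function name is hypothetical, and the final mask to 8 bits is an assumption that mirrors the 8-bit width of the color_line_inc bus.

```python
def color_line_inc(line_size):
    """Round a 16-bit LineSize up, following the pseudocode above.

    The upper byte of line_size is incremented when the lower byte is
    non-zero; the result is masked to 8 bits like the color_line_inc bus.
    """
    line_size &= 0xFFFF
    upper = line_size >> 8
    if line_size & 0xFF:
        return (upper + 1) & 0xFF
    return upper
```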
31.5.4 Dot generator

[Figure: the dot count block takes buf_emp, llu_en, llu_go_pulse, phi_llu_ready and line_size as inputs and produces dot_active and dot_avail; dot_data (x6) is read from the DIU buffers, which form an external array.]

Figure 230. Dot generator RTL Diagram



The dot generator block is responsible for reading dot data from the DIU buffers and sending the dot data
in the correct order to the PHI block. The dot generator waits for the llu_en signal from the fifo fill level
block; once it is active the dot generator starts reading data from the 6 DIU buffers and generating dot data
for feeding to the PHI.

In the LLU there are two instances of the dot generator, one generating odd data and the other generating
even data.

At any time the ready bit from the PHI could be de-asserted. If this happens the dot generator will stop
generating data, and wait for the ready bit to be re-asserted.



31.5.4.1 Dot count

In normal operation the dot counter will wait for the llu_en and the ready to be active before starting to
count. The dot count will produce data as long as the phi_llu_ready is active. If the phi_llu_ready signal
goes low the count will be stalled.

The dot counter increments for each dot that is processed per line. It is used to determine the line finish
position, and the bit select value for reading from the DIU buffers. The counter is reset after each line is
processed (line_fin signal). It determines when a line is finished by comparing the dot count with the
configured line size divided by 2 (note that odd numbers of dots will be rounded down).

// define the line finish
if (dot_cnt[14:0] == line_size[15:1]) then
    line_fin = 1
else
    line_fin = 0
// determine if word is valid
dot_active = ((llu_en == 1) AND (phi_llu_ready == 1) AND (buf_emp == 0))
// counter logic
if (llu_go_pulse == 1) then
    dot_cnt = 0
elsif ((dot_active == 1) AND (line_fin == 1)) then
    dot_cnt = 0
elsif (dot_active == 1) then
    dot_cnt = dot_cnt + 1
else
    dot_cnt = dot_cnt
// calculate the word select bits
bit_sel[5:0] = dot_cnt[5:0]

The dot generator also maintains a read buffer pointer which is incremented each time a 64-bit word is
processed. The pointer is used to address the correct 64-bit dot data word within the DIU buffers. The
pointer is reset when llu_go_pulse is 1. Unlike the dot counter the read pointer is not reset each line but
is rounded up to the nearest 256-bit word. This allows for more efficient use of the DIU buffers at line
finish.

// read pointer logic
if (llu_go_pulse == 1) then
    read_adr = 0
elsif ((dot_active == 1) AND (dot_cnt[5:0] == 63)) then
    read_adr++                  // normal increment
elsif ((dot_active == 1) AND (line_fin == 1)) then {
    // special end of line case
    if (dot_cnt[7:0] != 0) then
        read_adr[3:2]++         // end of line round up
        read_adr[1:0] = 0
    }
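
The dot counter behaviour can be exercised with a small cycle-by-cycle model. This is a sketch under stated assumptions: the class and method names are hypothetical, each call to step() stands for one clock cycle, and only the counter logic (not the read pointer) is modelled.

```python
class DotCounter:
    """Hypothetical single-cycle model of the dot count pseudocode above."""

    def __init__(self, line_size):
        self.half_line = (line_size & 0xFFFF) >> 1   # line_size[15:1]
        self.dot_cnt = 0

    def step(self, llu_en=1, phi_llu_ready=1, buf_emp=0, llu_go_pulse=0):
        """Advance one cycle; return (dot_active, line_fin, bit_sel)."""
        line_fin = int(self.dot_cnt == self.half_line)
        dot_active = int(llu_en == 1 and phi_llu_ready == 1 and buf_emp == 0)
        bit_sel = self.dot_cnt & 0x3F                # dot_cnt[5:0]
        if llu_go_pulse:
            self.dot_cnt = 0
        elif dot_active and line_fin:
            self.dot_cnt = 0
        elif dot_active:
            self.dot_cnt += 1
        # otherwise the count holds (stalled)
        return dot_active, line_fin, bit_sel
```

Calling step(phi_llu_ready=0) leaves dot_cnt unchanged, which models the stall described in the text.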



31.5.5 Fifo fill level

The LLU keeps a running total of the number of lines in the dot line store FIFO. Every time the DWU
signals a line end (dwu_llu_line_wr active pulse) it increments the filllevel. Conversely if the LLU detects
a line end (line_rd pulse) the filllevel is decremented and the line read is signalled to the DWU via the
llu_dwu_line_rd signal.

The LLU fill level block is used to determine when the dot line store has enough data stored before the
LLU should begin to start reading. The LLU at page start is disabled. It waits for the DWU to write lines to
the dot line FIFO, and for the fill level to increase. The LLU remains disabled until the fill level has
reached the programmed threshold (fifo_read_thres). When the threshold is reached it signals the LLU to
start processing the page by setting llu_en high. Once the LLU has started processing dot data for a page it
will not stop if the filllevel falls below the threshold.

The line fifo fill level can be read by the CPU via the PCU at any time by accessing the FifoFillLevel
register. The CPU must toggle the Go register in the LLU for the block to be correctly initialized at page
start and the fifo level reset to zero.

if (llu_go_pulse == 1) then
    filllevel = 0
elsif ((line_rd == 1) AND (dwu_llu_line_wr == 1)) then
    // do nothing
elsif (line_rd == 1) then
    filllevel--
elsif (dwu_llu_line_wr == 1) then
    filllevel++

// determine the threshold, and set the LLU going
if (llu_go_pulse == 1) then
    llu_en = 0
elsif (filllevel == fifo_read_threshold) then
    llu_en = 1
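
The fill level logic can be sketched as a small model. This is illustrative only: the class name is hypothetical, and the ordering within a clock cycle is simplified so that llu_en is evaluated against the fill level after it has been updated in the same step.

```python
class FifoFillLevel:
    """Hypothetical model of the fill level and llu_en pseudocode above."""

    def __init__(self, fifo_read_threshold):
        self.fifo_read_threshold = fifo_read_threshold
        self.filllevel = 0
        self.llu_en = 0

    def step(self, llu_go_pulse=0, line_rd=0, dwu_llu_line_wr=0):
        """Apply one cycle of inputs and return the resulting llu_en."""
        if llu_go_pulse:
            self.filllevel = 0
        elif line_rd and dwu_llu_line_wr:
            pass                      # simultaneous read and write cancel out
        elif line_rd:
            self.filllevel -= 1
        elif dwu_llu_line_wr:
            self.filllevel += 1

        if llu_go_pulse:
            self.llu_en = 0
        elif self.filllevel == self.fifo_read_threshold:
            self.llu_en = 1           # sticky until the next llu_go_pulse
        return self.llu_en
```

Once set, llu_en stays high even if later line reads take the fill level back below the threshold, matching the description above.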
31.5.6 DIU Interface

Figure 231. DIU interface

31.5.6.1 DIU interface description

The DIU interface block is responsible for determining when dot data needs to be read from DRAM,
keeping the dot generators supplied with data and calculating the DRAM read address based on configured
parameters, FIFO fill levels and position in a line.

The fill level block enables DIU requests by activating the llu_en signal. The DIU interface controller then
issues requests to the DIU for the LLU buffers to be filled with dot line data (or fills the LLU buffers with
null data without requesting DRAM access, if required).

At page start the DIU interface determines which buffers should be filled with null data and which should
request DRAM access. New requests are issued until the dot line is completely read from DRAM.
For each request to the DRAM the address generator calculates where in the DRAM the dot data should be
read from. The color_enable bus determines which colors are enabled; the interface never issues DRAM
requests for disabled colors.

31.5.6.2 Interface controller

[State diagram: states Idle, ColorSelect, Request, Data0, Data1, Data2 and Data3; the transitions are described in the text that follows.]

Machine remains in same state by default.
All outputs are zero unless otherwise stated.

State Description:
Idle: Idle state, wait for active request
ColorSelect: Select the color to update before requesting to DIU
Request: Request issued, wait for acknowledge
Data0: Data word 0 transfer
Data1: Data word 1 transfer
Data2: Data word 2 transfer
Data3: Data word 3 transfer

Figure 232. Interface controller state diagram

The interface controller co-ordinates and issues requests for data transfers from DRAM. The state machine
waits in Idle state until it is enabled by the LLU controller (llu_en) and a request for data transfer is
received from the write pointer block.

When an active request is received (req_active equals 1) the state machine jumps to the ColorSelect state
to determine which colors (color_cnt) in the group need a data transfer. A group is defined as all odd
colors or all even colors. If the color isn't enabled (color_enable) the count just increments, and no data
is transferred. If the color is enabled, the state machine takes one of two options, either a null data
transfer or an actual data transfer from DRAM. A null data transfer writes zero data to the DIU buffer and
does not issue a request to DRAM.

The state machine determines if a null transfer is required by checking the color_start signal for that color.

If a null transfer is required the state machine doesn't need to issue a request to the DIU and so jumps
directly to the data transfer states (Data0 to Data3). The machine clocks through the 4 states, each time
writing a null 64-bit data word to the buffer. Once complete the state machine returns to the ColorSelect
state to determine if further transfers are required.

If the color_start is active then a data transfer is required. The state machine jumps to the Request state
and issues a request to the DIU controller for DRAM access by setting llu_diu_rreq high. The DIU
responds by acknowledging the request (diu_llu_rack equals 1) and then sending 4 64-bit words of data.
The transition from Request to Data0 state signals the address generator to update the address pointer
(adr_update). The state machine clocks through the Data0 to Data3 states, each time writing the 64-bit data
into the buffer selected by the req_sel bus. Once complete the state machine returns to the ColorSelect
state to determine if further transfers are required.

When in the ColorSelect state and all data transfers for colors in that group have been serviced (i.e. when
color_cnt is 6) the state machine will return to the Idle state. On transition it will update the word counter
logic (word_dec) and enable the request logic (req_update).

A reset or llu_go_pulse set to 1 will cause the state machine to jump directly to Idle. The controller will
remain in Idle state until it is enabled by the LLU controller via the llu_en signal. This prevents the DIU
interface attempting to fill the DIU buffers before the dot line store FIFO has filled over its threshold level.

31.5.6.3 Color activate

The color activate logic maintains an absolute line count indicating the line number currently being
processed by the LLU. The counter is reset when the llu_go_pulse is 1 and incremented each time a line_rd
pulse is received. The count value (line_cnt) is used to determine when to start reading data for a color.
The count is implemented as follows:
if (llu_go_pulse == 1) then
    line_cnt = 0
elsif (line_rd == 1) then
    line_cnt++

The color activate logic compares the line count with the relative line value to determine when the LLU
should start reading data from DRAM for a particular half color. It signals to the interface controller block
which colors are active for this dot line in a page (via the color_start bus). It is used by the interface
controller to determine which DIU buffers require null data.

Once the color_start bit for a color is set it cannot be cleared in the normal page processing process. The
bits must be reset by the CPU at the end of a page by transitioning the Go bit and causing a pulse on the
llu_go_pulse signal.

Any color not enabled by the color_enable bus will never have its color_start bit set.

for (i=0; i<12; i++) {
    if (llu_go_pulse == 1) then
        col_on[i] = 0
    elsif (color_enable[i / 2] == 0) then
        col_on[i] = 0
    elsif (line_cnt == color_rel_line[i]) then
        col_on[i] = 1
    }

// select either odd or even colors
if (odd_even_sel == 1) then     // odd selected
    color_start[5:0] = {col_on[11], col_on[9], col_on[7], col_on[5], col_on[3], col_on[1]}
else                            // even selected
    color_start[5:0] = {col_on[10], col_on[8], col_on[6], col_on[4], col_on[2], col_on[0]}



31.5.6.4 Address generator

The address generator block maintains 12 pointers (color_adr[11:0]) to DRAM corresponding to the current
read address in the dot line store for each half color. When a DRAM transfer occurs the address pointer is
used first and then updated for the next transfer for the color. The pointer used is selected by the req_sel
bus, and the pointer update is initiated by the adr_update signal from the interface controller.
The pointer update and pointer initialization are dependent on the pointer position in a line and the line
position in the FIFO.

When a llu_go_pulse is received the pointers are each initialized to the corresponding base address for that
color (color_base_adr). For each word that is read from DRAM the pointer is incremented. If the word is
the last word in a line (last_wd equals 1) and the last line in the fifo (fifo_end equals 1) then the address
pointer is re-initialized to the base address value. The pointer is incremented for all other words.

The address is calculated as follows:
// reset to base address
if (llu_go_pulse == 1) then
    color_adr[11:0] = color_base_adr[11:0][21:5]
elsif (adr_update == 1) then
    if (req_sel == NULL) then
        // do nothing
    elsif ((fifo_end == 1) AND (last_wd == 1)) then
        color_adr[req_sel] = color_base_adr[req_sel][21:5]
    else
        color_adr[req_sel]++    // normal increment

// select the address pointer
llu_diu_radr = color_adr[req_sel]
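
A minimal model of the pointer update, assuming one call per adr_update pulse. The class and method names are hypothetical, addresses are held as 17-bit values in units of 256-bit words, and fifo_end and last_wd are supplied by the caller rather than derived from the line pointer and word count logic.

```python
class AddressGenerator:
    """Hypothetical model of the address generator pseudocode above."""

    ADR_MASK = 0x1FFFF                      # 17-bit, 256-bit aligned address

    def __init__(self, color_base_adr):
        self.color_base_adr = [a & self.ADR_MASK for a in color_base_adr]
        self.color_adr = list(self.color_base_adr)   # state after llu_go_pulse

    def transfer(self, req_sel, fifo_end=0, last_wd=0):
        """Return the address used for this transfer, then update the pointer."""
        if req_sel is None:                 # null transfer: no DRAM access
            return None
        used = self.color_adr[req_sel]
        if fifo_end and last_wd:
            # last word of the last line in the FIFO: wrap to the base address
            self.color_adr[req_sel] = self.color_base_adr[req_sel]
        else:
            self.color_adr[req_sel] = (used + 1) & self.ADR_MASK
        return used
```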



31.5.6.5 Line pointer

The line pointer logic counts the number of dot data lines read from DRAM for each color. The counter
value is used to signal the fifo wrap point to the address generator logic. A separate counter is maintained
for each color.

The end of a line can be determined when the address is updated (adr_update equals 1) and the word
transferred is the last word of a line (last_wd equals 1). The line pointer that needs to be updated is
selected by the req_sel bus from the write pointer block. If the selected pointer is zero the counter is
reset to the corresponding color_fifo_size value, otherwise the counter is decremented.

If the llu_go_pulse signal is high each counter is reset to its corresponding color_fifo_size value. When
the counter is zero it sets the fifo_end bit to signal to the address generator that the fifo has wrapped (to
update the address pointer accordingly).

if (llu_go_pulse == 1) then
    line_pt[11:0] = color_fifo_size[11:0]
elsif ((adr_update == 1) AND (last_wd == 1)) then {
    if (line_pt[req_sel] == 0)
        line_pt[req_sel] = color_fifo_size[req_sel]
    else
        line_pt[req_sel]--
    }

// select the correct line pointer for comparison
fifo_end = (line_pt[req_sel] == 0)
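
The wrap behaviour of a single line pointer can be sketched as follows. The class name is hypothetical and the model covers one half color only; line_done() stands for a cycle in which adr_update and last_wd are both 1 for that half color.

```python
class LinePointer:
    """Hypothetical model of one line pointer from the pseudocode above."""

    def __init__(self, color_fifo_size):
        self.color_fifo_size = color_fifo_size
        self.line_pt = color_fifo_size       # state after llu_go_pulse

    @property
    def fifo_end(self):
        """1 when the next completed line wraps the address pointer."""
        return int(self.line_pt == 0)

    def line_done(self):
        """Update the pointer when the last word of a line is transferred."""
        if self.line_pt == 0:
            self.line_pt = self.color_fifo_size
        else:
            self.line_pt -= 1
```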

31.5.6.6 Write pointer

The write pointer logic maintains the buffer write address pointers, determines when the DIU buffers need
a data transfer and signals when the DIU buffers are empty. The write pointer determines the address in the
DIU buffer that the data should be transferred to.

The write pointer logic compares the read and write pointers of each DIU buffer to determine which
buffers require data to be transferred from DRAM (pend[11:0] bus), and which buffers are empty (the
buf_emp signals). Only enabled buffers are considered, as indicated by the color_enable bus.

Buffers are grouped into odd and even buffers. If an odd buffer requires DRAM access the odd_pend
signal will be active; if an even buffer requires DRAM access the even_pend signal will be active. If both
odd and even buffers require DRAM access, the even buffers will get serviced first.
If any buffer requires a DRAM transfer, the logic will indicate this to the interface controller via the
req_active signal, with the odd_even_sel signal determining which group of buffers gets serviced. The
interface controller will check the color_enable signal and issue DRAM transfers for all enabled colors in a
group. When the transfers are complete it tells the write pointer logic to update the request pending via the
req_update signal.

The req_sel[3:0] signal tells the address generator which buffer is being serviced. It is constructed from
the odd_even_sel signal and the color_cnt[2:0] bus from the interface controller. When data is being
transferred from DRAM the word pointer and write pointer for the corresponding buffer are updated. The
req_sel determines which pointer should be incremented.

The write pointer logic operates the same way regardless of whether the transfer is null or not.

// determine which buffers need updates
for (i=0; i<12; i++) {
    // determine if request is active, filtered by color enable
    if (wr_adr[i][3:2] == rd_adr[i][3:2])
        pend[i] = 1
    else
        pend[i] = 0
    // determine if any enabled buffer is empty
    if ((wr_adr[i][3:0] == rd_adr[i][3:0]) AND (color_enable[i / 2] == 1)) then
        buf_emp[i] = 1
    }

// Odd half colors (1,3,5,7,9,11), even half colors (0,2,4,6,8,10)
odd_pend = (pend[1] | pend[3] | pend[5] | pend[7] | pend[9] | pend[11])
even_pend = (pend[0] | pend[2] | pend[4] | pend[6] | pend[8] | pend[10])
// fixed servicing order, only update when controller dictates so
if (req_update == 1) then {
    if (even_pend == 1) then        // even always first
        odd_even_sel = 0
        req_active = 1
    elsif (odd_pend == 1) then      // then check odd
        odd_even_sel = 1
        req_active = 1
    else                            // nothing active
        odd_even_sel = 0
        req_active = 0
    }

// selected requestor
req_sel[3:0] = {color_cnt[2:0], odd_even_sel}   // concatenation
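
The fixed servicing order and the req_sel concatenation can be illustrated with two small functions. This is a sketch only: the function names are hypothetical, the gating by req_update is omitted, and pend is passed in as a list of twelve 0/1 values indexed by half color.

```python
def arbitrate(pend):
    """Return (odd_even_sel, req_active) for a 12-entry pend list.

    Even half colors (0, 2, ..., 10) are always serviced before odd ones.
    """
    even_pend = any(pend[0::2])
    odd_pend = any(pend[1::2])
    if even_pend:
        return 0, 1
    if odd_pend:
        return 1, 1
    return 0, 0


def req_sel(color_cnt, odd_even_sel):
    """Concatenate {color_cnt[2:0], odd_even_sel} into a 4-bit selector."""
    return ((color_cnt & 0x7) << 1) | (odd_even_sel & 0x1)
```

The selector produced this way is the half-color index used elsewhere in this section: color 2 odd, for example, maps to half color 5.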

The write address pointer logic consists of 12 2-bit counters and a word select pointer. The counters are
reset when llu_go_pulse is asserted. The word pointer (word_ptr) is common to all buffers and is used to
write 64-bit words into the DIU buffer. It is incremented when buf_wr_en is active. If the word_ptr is 3 and
buf_wr_en is active, the write pointer (wr_ptr[req_sel]) will be incremented. A concatenation of the write
pointer and the word pointer is used to construct the buffer write address. The write pointers are not reset
at the end of each line.



    // determine which pointer to update
    if (buf_wr_en == 1) then {
        wr_adr[req_sel]
        wr_en[req_sel] = 1
    }
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// determine which pointer to update 
    if (llu_go_pulse == 1) then
        wr_ptr[11:0] = 0
        word_ptr = 0
    elsif (buf_wr_en == 1) then {
        word_ptr++
        if (word_ptr == 3) then
            wr_ptr[req_sel]++
    }

    // create the address from the write pointer and word pointer
    wr_adr[req_sel] = {wr_ptr[req_sel], word_ptr} // concatenation
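
The pointer update above can be exercised with a small software model. The following Python sketch is illustrative only: the class name is hypothetical, the pointers are assumed to be 2 bits wide (so that the concatenated address is 4 bits), and the word_ptr comparison is assumed to use the value held before the increment, as it would in clocked logic.

```python
class WritePointerModel:
    """Toy model of the LLU write pointer logic (names are hypothetical)."""

    def __init__(self, num_buffers=12):
        self.wr_ptr = [0] * num_buffers  # one write pointer per buffer (assumed 2 bits)
        self.word_ptr = 0                # word pointer shared by all buffers (2 bits)

    def go_pulse(self):
        # llu_go_pulse clears every write pointer and the word pointer
        self.wr_ptr = [0] * len(self.wr_ptr)
        self.word_ptr = 0

    def wr_adr(self, req_sel):
        # address = {wr_ptr[req_sel], word_ptr} (concatenation)
        return (self.wr_ptr[req_sel] << 2) | self.word_ptr

    def write_word(self, req_sel):
        """Model one cycle with the write enable active; return the address used."""
        adr = self.wr_adr(req_sel)
        if self.word_ptr == 3:
            # last word of a 4-word group: advance the selected buffer's pointer
            self.wr_ptr[req_sel] = (self.wr_ptr[req_sel] + 1) & 0x3
        self.word_ptr = (self.word_ptr + 1) & 0x3
        return adr
```

Writing four words to one buffer walks the address through 0, 1, 2, 3 and leaves the next write address at 4.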



31.5.6.7 Word count 



The word count logic maintains 2 counters to track the number of words transferred from DRAM per line,
one counter for odd data and one counter for even. On receipt of an llu_go_pulse, the counters are
initialized to the color_line_inc value (the number of words per line). When a group of words is transferred
from DRAM, as indicated by the word_dec signal from the interface controller, the corresponding counter is
decremented. The counter to decrement is indicated by the odd_even_sel signal from the write pointer
block (even = 0, odd = 1).

When a counter is zero the last_wd signal for that group (i.e. odd or even) is set. The last_wd signal
indicates to the address generator that the next word transferred from DRAM for the corresponding color is
the last word in the line. When the last word actually gets transferred the interface controller will pulse the
word_dec signal, causing the corresponding word count to reset to the color_line_inc value.

    // determine which counter to decrement
    if (llu_go_pulse == 1) then
        word_cnt[0] = color_line_inc // even count
        word_cnt[1] = color_line_inc // odd count
    elsif (word_dec == 1) then { // need to decrement one word counter
        if (word_cnt[odd_even_sel] == 0) then // line finish
            word_cnt[odd_even_sel] = color_line_inc
        else
            word_cnt[odd_even_sel]--
    }

    // select the correct last_wd
    last_wd = (word_cnt[odd_even_sel] == 0)

The word count logic also determines when a complete line has been read from DRAM. It then signals
(via the line_rd signal) that a complete line has been read by the LLU (llu_dwu_line_rd).

    // line finish logic
    if (llu_go_pulse == 1) then
        line_fin = 0
        line_rd = 0
    elsif ((last_wd == 1) AND (line_fin == 0) AND (word_dec == 1)) then
        line_fin = 1 // first group last_wd finish pulse
        line_rd = 0
    elsif ((last_wd == 1) AND (line_fin == 1) AND (word_dec == 1)) then
        line_fin = 0 // second group last_wd finish pulse
        line_rd = 1
    else
        line_fin = line_fin // stay the same
        line_rd = 0
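
The interaction between the two word counters and the line finish flag can be checked with a short model. This Python sketch is an illustration under assumptions: the class and method names are hypothetical, index 0 is taken as the even group and index 1 as the odd group (following odd_even_sel), and line_rd is reported on the same word_dec pulse rather than one clock later.

```python
class WordCountModel:
    """Toy model of the word count and line finish logic (hypothetical names)."""

    def __init__(self, color_line_inc):
        self.color_line_inc = color_line_inc
        # state after llu_go_pulse: both counters loaded, no group finished
        self.word_cnt = [color_line_inc, color_line_inc]  # [even, odd]
        self.line_fin = 0

    def word_dec(self, odd_even_sel):
        """Apply one word_dec pulse; return (last_wd, line_rd) for that pulse."""
        last_wd = self.word_cnt[odd_even_sel] == 0
        if last_wd:
            # last word of the line for this group: reload the counter
            self.word_cnt[odd_even_sel] = self.color_line_inc
        else:
            self.word_cnt[odd_even_sel] -= 1

        line_rd = False
        if last_wd and self.line_fin == 0:
            self.line_fin = 1      # first group has finished its line
        elif last_wd and self.line_fin == 1:
            self.line_fin = 0      # second group finished: whole line read
            line_rd = True
        return last_wd, line_rd
```

With color_line_inc set to 2, each group needs three word_dec pulses to finish, and line_rd is raised only when the second group finishes.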



Doc: SoPEC_hardware_design   Version: 2.3   S3 Proprietary Document   Nov 2002



32 PrintHead Interface (PHI) 



32.1 Overview



The Printhead interface (PHI) accepts dot data from the LLU and transmits the dot data to the printhead,
using the printhead interface mechanism. The PHI generates the control and timing signals necessary to
load and drive the bi-lithic printhead. The CPU determines the line update rate to the printhead and adjusts
the line sync frequency to produce the maximum print speed, accounting for the printhead IC's size ratio
and inherent latencies in the syncing system across multiple SoPECs.

The PHI also needs to consider the order in which dot data is loaded in the printhead. This is dependent on 
the construction of the printhead and the relative sizes of printhead ICs used to create the printhead. See 
Bi-lithic Printhead Reference document for a complete description of printhead types [10]. 

The printing process is a real-time process. Once the printing process has started, the next print line's data
must be transferred to the printhead before the next line sync pulse is received by the printhead. Otherwise
the printing process will terminate with a buffer underrun error.

The PHI can be configured to drive a single printhead IC with or without synchronization to other
SoPECs. For example the PHI could drive a single IC printhead (i.e. a printhead constructed with one IC
only), or a dual IC printhead with one SoPEC device driving each printhead IC.

The PHI interface provides a mechanism for the CPU to directly control the PHI interface pins, allowing
the CPU to access the bi-lithic printhead to:

• determine printhead temperature

• test for and determine dead nozzles for each printhead IC 

• initialize each printhead IC 

• pre-heat each printhead IC 

Figure 233 shows a high level data flow diagram of the PHI in context. 



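[Figure 233: block diagram. Within SoPEC, the LLU passes dot data to the PHI, and the PHI passes dot data
to the bi-lithic printhead. Test data and temperature data return from the printhead to the PHI and on to
the CPU, which supplies control to the PHI.]

Figure 233. High level data flow diagram of PHI in context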
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32.2 Printhead modes of operation 

The printhead has 4 different modes of operation (although some modes are re-used). The mode of
operation is defined by the state of the output pins phi_lsyncl and phi_readl. As both printhead ICs are
driven by the same signals, both printhead ICs must be in the same mode of operation. The modes of
operation are defined in Table 158.



Table 158. Printhead modes of operation

Mode | Mode pin states | Clock condition | Description
NORMAL | 1, 1 | N/A | Normal print mode. Dot data is clocked into the printhead shift register on each falling edge of phi_srclk.
DOT_LOAD / FIRE_INIT | 1, 0 | phi_frclk=0 | Dot load mode. Data stored in the dot shift register is transferred into the dot latch on the falling edge of phi_lsyncl, and latched in on the rising edge of phi_lsyncl.
DOT_LOAD / FIRE_INIT | 1, 0 | phi_srclk=1 | Fire load mode. Parameters for generating the fire pattern are loaded into the generator. Data on phi_ph_data[1:0][0] is clocked into the generator on each rising edge of phi_frclk.
TEST_MODE | 0, 0 | | Dot load mode. Data stored in the dot shift register is transferred into the dot register on the rising edge of phi_lsyncl, identical to DOT_LOAD.
TEST_MODE | 0, 0 | | The printhead is in test mode. The temperature delta sigma is clocked out of the printhead on the rising edge of phi_frclk through phi_ph_data[1:0][1]. The result of the nozzle test is clocked out of the printhead through phi_ph_data[1:0][0].
FIRE_GEN | 0, 1 | N/A | The nozzle test circuit is reset.
FIRE_GEN | 0, 1 | | CMOS testing mode. The dot shift register is scanned out of the printhead on the falling edge of phi_srclk. Data is output on phi_ph_data[1:0][1:0].
FIRE_GEN | 0, 1 | | The initialised generator creates the fire pattern and shift select pattern, and the pattern is clocked into the fire shift register and select shift register on the rising edge of phi_frclk.



32.3 Data rate equalization



The LLU can generate dot data at the rate of 12 bits per cycle, where a cycle is at the system clock
frequency. The printhead needs to print a line every 100 µs (calculated from 300 mm @ 65.2 dots/mm
divided by 2 seconds =~ 100 µs). For a 7:3 constructed printhead this means that 9744 cycles at 106 MHz
is quick enough to transfer the dot data. The input FIFOs are used to de-couple the read and write clock
domains as well as to provide for differences between the consume and fill rates of the PHI and LLU.

Nominally the system clock (pclk) is run at 160 MHz and the printhead interface clock (phiclk) is at
106 MHz.
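
The line time budget quoted above can be checked with a few lines of arithmetic. The Python sketch below uses only the figures from the text (300 mm page, 65.2 dots/mm, 2 seconds per page, 9744 cycles at 106 MHz). The constant names are hypothetical, and it assumes one dot position is shifted per phiclk cycle for the longer printhead IC.

```python
PAGE_LENGTH_MM = 300      # page length used in the text
DOTS_PER_MM = 65.2        # line pitch along the page
PAGE_TIME_S = 2.0         # time allowed to print one page
PHICLK_HZ = 106e6         # printhead interface clock
LONG_IC_CYCLES = 9744     # cycles to load the 7 inch IC of a 7:3 printhead

lines_per_page = PAGE_LENGTH_MM * DOTS_PER_MM          # number of lines on the page
line_time_us = PAGE_TIME_S / lines_per_page * 1e6      # time available per line
transfer_time_us = LONG_IC_CYCLES / PHICLK_HZ * 1e6    # time needed to load one line

print(f"line time     : {line_time_us:.1f} us")
print(f"transfer time : {transfer_time_us:.1f} us")
```

The transfer time comes out below the line time, which is the sense in which 9744 cycles at 106 MHz is "quick enough".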

If the PHI was to transfer data at the full printhead interface rate, the transfer of data to the shorter
printhead IC would be completed sooner than to the longer printhead IC. While in itself this isn't an issue,
it requires that the LLU be able to supply data at the maximum rate for a short duration, which requires
uneven, bursty access to DRAM and is undesirable. To smooth the LLU DRAM access requirements over time
the PHI transfers dot data to the printhead at a pre-programmed rate, proportional to the ratio of the shorter
to longer printhead ICs.



[Figure 235: timing diagram for a 7:3 head over a 100 µs line time, showing phi_lsyncl,
phi_ph_data[0][1:0], phi_ph_data[1][1:0], phi_srclk[0] and phi_srclk[1], first without rate equalization
and then with rate equalization.]

Figure 235. Printhead data rate equalization

The printhead data rate equalization is controlled by the PrintHeadRate[1:0] registers (one per printhead
IC). The register is a 16 bit bitmap of active clock cycles in a 16 clock cycle window. For example, if the
register is set to 0xFFFF then the output rate to the printhead will be full rate. If it is set to 0xF0F0 then the
output rate is 50%, where there are 4 active cycles followed by 4 inactive cycles and so on. If the register
was set to 0x0000 the rate would be 0%. The relative data transfer rate of the printhead can be varied from
0-100% with a granularity of 1/16 steps.
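
The relationship between a PrintHeadRate bitmap and the resulting transfer rate is a bit count over the 16 cycle window. The Python sketch below is illustrative: the function names are hypothetical, and the choice of the most significant bit as the first cycle of the window is an assumption made so that 0xF0F0 reads as 4 active cycles followed by 4 inactive cycles.

```python
def rate_fraction(print_head_rate):
    """Fraction of active cycles in the 16 cycle window for a bitmap value."""
    if not 0 <= print_head_rate <= 0xFFFF:
        raise ValueError("PrintHeadRate is a 16 bit value")
    return bin(print_head_rate).count("1") / 16


def active_cycles(print_head_rate):
    """Per-cycle enables for one window, MSB first (bit order is an assumption)."""
    return [(print_head_rate >> (15 - i)) & 1 for i in range(16)]


for value in (0xFFFF, 0xF0F0, 0x5551, 0x1111, 0x0000):
    print(f"0x{value:04X}: {rate_fraction(value) * 100:.2f}%")
```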

Table 159. Example rate equalization values for common printheads

Printhead ratio | Rate for longer IC | Rate for shorter IC
8:2 | 0xFFFF (100%) | 0x1111 (25%)
7:3 | 0xFFFF (100%) | 0x5551 (43.7%)
6:4 | 0xFFFF (100%) | 0xF1F2 (68.7%)
5:5 | 0xFFFF (100%) | 0xFFFF (100%)



If both printhead ICs are the same size (e.g. a 5:5 printhead) it may be desirable to reduce the data rate to
both printhead ICs, to reduce the read bandwidth from the DRAM.
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32.4 Dot generate and transmit order 

Several printhead types and arrangements exist (see Section 35 Memjet Printhead). The PHI is capable of
driving all possible configurations, but for simplicity only one arrangement (arrangement 0, see
Section 35 Memjet Printhead) is described in the following examples.



[Figure 236: diagram of a type 0 printhead IC and a type 1 printhead IC, showing the dot transmit order
along two rows of nozzles (dots 0, 2, 4, ... and ..., m-5, m-3, m-1 for type 0, with corresponding
positions up to n-1 for type 1), rows separated by 5 lines, and the paper direction. M is the midway point
in dots, N is the number of dots in a line. Note: paper passes under the printhead.]

Figure 236. Printhead structure and dot generate order

The structure of the printhead ICs dictates the dot transmit order to each printhead IC. The PHI accepts two
streams of dot data from the LLU, one even stream and the other odd. The PHI constructs the dot transmit
order streams from the dot generate order received from the LLU. Each stream of data has already been
arranged in increasing or decreasing dot order sense by the DWU. The exact sense choice is dependent on
the type of printhead ICs used to construct the printhead, but regardless of configuration the odd and even
streams should be of opposing sense.

The dot transmit order is shown in Figure 236. Dot data is shifted into the printhead in the direction of the
arrow, so from the diagram (taking the type 0 printhead IC) even dot data is transferred in increasing order
to the mid point first (0, 2, 4, ..., m-6, m-4, m-2), then odd dot data in decreasing order is transferred
(m-1, m-3, m-5, ..., 5, 3, 1). For the type 1 printhead IC the order is reversed, with odd dots in increasing
order transmitted first, followed by even dot data in decreasing order. Note that for any given color the odd
and even dot data transferred to the printhead ICs are from different dot lines; in the example in the
diagram they are separated by 5 dot lines. Table 160 shows the transmit dot order for some common A4
printheads. Different type printheads may have the sense reversed and may have an odd before even
transmit order or vice versa.
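
The type 0 ordering described above (even dots rising to the mid point, then odd dots falling back to dot 1) can be written down directly. The Python sketch below is an illustration only: the function name is hypothetical and m is taken to be an even mid point value, as in the figure legend. It covers the type 0 case only.

```python
def type0_transmit_order(m):
    """Dot indices in transmit order for a type 0 printhead IC with mid point m."""
    if m <= 0 or m % 2 != 0:
        raise ValueError("mid point m is assumed to be a positive even number")
    even_increasing = list(range(0, m, 2))       # 0, 2, 4, ..., m-2
    odd_decreasing = list(range(m - 1, 0, -2))   # m-1, m-3, ..., 3, 1
    return even_increasing + odd_decreasing


print(type0_transmit_order(8))
```

For m = 5580 the even run ends at 5574, 5576, 5578 and the odd run starts at 5579, 5577, 5575, which matches the 11160 dot type 0 entry in Table 160.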



Table 160. Example printhead ICs, and dot data transmit order for A4 (13824 dots) page

Type 0 Printhead IC

Size (inches) | Dots | Transmit order, first half | Transmit order, second half
8 | 11160 | 0, 2, 4, 6, ..., 5574, 5576, 5578 | 5579, 5577, 5575, ..., 7, 5, 3, 1
7 | 9744 | 0, 2, 4, 6, ..., 4866, 4868, 4870 | 4871, 4869, 4867, ..., 7, 5, 3, 1
6 | 8328 | 0, 2, 4, 6, ..., 4158, 4160, 4162 | 4163, 4161, 4159, ..., 7, 5, 3, 1
5 | 6912 | 0, 2, 4, 6, ..., 3450, 3452, 3454 | 3455, 3453, 3451, ..., 7, 5, 3, 1
4 | 5496 | 0, 2, 4, 6, ..., 2742, 2744, 2746 | 2747, 2745, 2743, ..., 7, 5, 3, 1
3 | 4080 | 0, 2, 4, 6, ..., 2034, 2036, 2038 | 2039, 2037, 2035, ..., 7, 5, 3, 1
2 | 2664 | 0, 2, 4, 6, ..., 1326, 1328, 1330 | 1331, 1329, 1327, ..., 7, 5, 3, 1

Type 1 Printhead IC

Size (inches) | Dots | Transmit order, first half | Transmit order, second half
8 | 11160 | 13823, 13821, 13819, ..., 1337, 1335, 1333 | 1332, 1334, 1336, ..., 13818, 13820, 13822
7 | 9744 | 13823, 13821, 13819, ..., 2045, 2043, 2041 | 2040, 2042, 2044, ..., 13818, 13820, 13822
6 | 8328 | 13823, 13821, 13819, ..., 2753, 2751, 2749 | 2748, 2750, 2752, ..., 13818, 13820, 13822
5 | 6912 | 13823, 13821, 13819, ..., 3461, 3459, 3457 | 3456, 3458, 3460, ..., 13818, 13820, 13822
4 | 5496 | 13823, 13821, 13819, ..., 4169, 4167, 4165 | 4164, 4166, 4168, ..., 13818, 13820, 13822
3 | 4080 | 13823, 13821, 13819, ..., 4877, 4875, 4873 | 4872, 4874, 4876, ..., 13818, 13820, 13822
2 | 2664 | 13823, 13821, 13819, ..., 5585, 5583, 5581 | 5580, 5582, 5584, ..., 13818, 13820, 13822


32.4.1 Dual Printhead IC 

[Figure 237: generate dot order (from the LLU) for the odd dot stream and the even dot stream, taking
6912 clock cycles, and transmit dot order (to the printhead) for printhead channel A and printhead
channel B, with the mid point at 4872 clock cycles and a total of 9744 clock cycles. Even dots come from
line Y, odd dots from line Y-5. Example: line with 13824 dots, with 7:3 printhead.]

Figure 237. Dot data generated and transmitted order



The LLU contains 2 dot generator units. Each dot generator reads dot data from DRAM and generates a
stream of dots in increasing or decreasing order. A dot generator can be configured to produce odd or even
dot data streams, and the dot sense is also configurable. In Figure 237 the odd dot generator is configured
to produce odd dot data in decreasing order and the even dot generator produces dot data in increasing
order.

In order to reconstruct the dot data streams from the generate order to the transmit order, the connection
between the generators and transmitters needs to be switched at the mid point. At line start the odd dot
generator feeds the type 1 printhead, and the even dot generator feeds the type 0 printhead. This continues
until both printheads have received half the number of dots they require (defined as the mid point). The
mid point is calculated from the configured printhead size registers (PrintHeadSize). Once both printheads
have reached the mid point, the PHI switches the connections between the dot generators and the
printhead, so now the odd dot generator feeds the type 0 printhead and the even dot generator feeds the
type 1 printhead. This continues until the end of the line.

It is possible that both printheads will not be the same size and as a result one dot generator may reach the
mid point before the other. In such cases the quicker dot generator is stalled until both dot generators reach
the mid point, then the connections are switched and both dot generators are restarted.

Note that in the example shown in Figure 237 the dot generators could generate an A4 line of data in 6912
cycles, but because of the mismatch in the printhead IC sizes the transmit time takes 9744 cycles.
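
The mid point switch can be summarised as a lookup from printhead type and progress through the line to the dot generator that feeds it. The Python sketch below is a simplified illustration: the function name and arguments are hypothetical, the mid point is taken as half the configured printhead size, and the stalling of the quicker generator is not modelled.

```python
def generator_for(printhead_type, dots_sent, print_head_size):
    """Return which dot generator ('even' or 'odd') feeds a printhead IC.

    printhead_type  -- 0 or 1
    dots_sent       -- dots already transferred to this IC in the current line
    print_head_size -- configured size of this IC in dots
    """
    if printhead_type not in (0, 1):
        raise ValueError("printhead type must be 0 or 1")
    before_mid_point = dots_sent < print_head_size // 2
    if printhead_type == 0:
        # type 0: even generator up to the mid point, odd generator afterwards
        return "even" if before_mid_point else "odd"
    # type 1: odd generator up to the mid point, even generator afterwards
    return "odd" if before_mid_point else "even"
```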

32.4.2 Single printhead IC

In some cases only one printhead IC may be connected to the PHI. In Figure 238 the dot generate and
transmit order is shown for a single IC printhead of 9744 dots width. While the example shows the
printhead IC connected to channel A, either channel could be used. The LLU generates odd and even dot
streams as normal, as it has no knowledge of the physical printhead configuration. The PHI is configured
with the printhead size (PrintHeadSize[1] register) for channel B set to zero and channel A set to 9744.

[Figure 238: generate dot order (from the LLU) for the odd dot stream and the even dot stream, taking
4872 clock cycles, and transmit dot order (to the printhead) for printhead channel A over 9744 clock
cycles, with printhead channel B unused. Even dots come from line Y, odd dots from line Y-5. Example:
line with 9744 dots, with 7:0 printhead.]

Figure 238. Dot data generated and transmitted order (single printhead case)

Note that in the example shown in Figure 238 the dot generators could generate a 7 inch line of data in
4872 cycles, but because the printhead is using one IC, the transmit time takes 9744 cycles, the same speed
as an A4 line with a 7:3 printhead.
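
Both cycle counts quoted for the dual and single IC examples follow from the same two rules: the two dot generators share the line between them, and the longest printhead IC sets the transmit time. The Python sketch below restates that arithmetic. The function name is hypothetical, and it assumes each generator produces one dot per cycle and each channel accepts one dot per cycle at full rate.

```python
def line_cycles(line_dots, ic_sizes):
    """Return (generate_cycles, transmit_cycles) for one line.

    line_dots -- total dots in the line
    ic_sizes  -- dots per printhead IC, e.g. (9744, 4080) for a 7:3 printhead
    """
    generate_cycles = line_dots // 2   # odd and even generators run in parallel
    transmit_cycles = max(ic_sizes)    # the longest IC limits the line time
    return generate_cycles, transmit_cycles


print(line_cycles(13824, (9744, 4080)))  # A4 line, 7:3 printhead
print(line_cycles(9744, (9744, 0)))      # 7 inch line, single IC on channel A
```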

32.4.3 Summary of generate and transmit order requirements

In order to support all the possible printhead arrangements, the PHI (in conjunction with the LLU/DWU)
must be capable of re-ordering the bits according to the following criteria:

• Be able to output the even or odd plane first 

• Be able to output even and odd planes independently. 

• Be able to reverse the sequence in which the color planes of a single dot are output to the printhead. 
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32.5 Print sequence 

The PHI is responsible for accepting dot data streams from the LLU, restructuring the dot data sequence 
and transferring the dot data to each printhead within a line time (i.e before the next line sync). 

Before a page can be printed the printhead ICs must be initialized. The exact initialization sequence is con- 
figuration dependent, but will involve the fire pattern generation initialization and other optional steps. The 
initialization sequence is implemented in software. 

Once the first line of data has been transferred to the printhead, the PHI will interrupt the CPU by asserting
the phi_icu_print_rdy signal. The interrupt can be optionally masked in the ICU and the CPU can poll the
signal via the PCU or the ICU. The CPU must wait for a print ready signal in all printing SoPECs before
starting printing.

Once the CPU in the PrintMaster SoPEC is satisfied that printing should start, it triggers the
LineSyncMaster SoPEC by writing to the PrintStart register of all printing SoPECs. The transition of the
PrintStart register in the LineSyncMaster SoPEC will trigger the start of lsyncl pulse generation. The
PrintMaster and LineSyncMaster SoPEC are not necessarily the same device, but often are. For a more in
depth definition see section 12.3 Multi-SoPEC systems on page 104.

Writing to the PrintStart register generates a pulse which is used to generate the line sync in the
LineSyncMaster, which is in turn used to align all SoPECs in a multi-SoPEC system. All printhead
signaling is aligned to the line sync. The PrintStart is only used to align the first line sync in a page.

When a SoPEC receives a line sync pulse it means that the line previously transferred to the printhead is
now printing, so the PHI can begin to transfer the next line of data to the printhead. When the transfer is
complete the PHI will wait for the next line sync pulse before repeating the cycle. If a line sync arrives
before a complete line is transferred to the printhead (i.e. a buffer error) the PHI generates a buffer
underrun interrupt, and halts the block.

For each line in a page the PHI must transfer a full line of data to the printhead before the next line sync is 
generated or received. 

32.5.1 Sync pulse control 

If the PHI is configured as the LineSyncMaster SoPEC it will start generating line sync signals LsyncPre
number of phiclk cycles after the PrintStart register rising transition is detected. All other signals in the
PHI interface are referenced from the falling edge of the phi_lsyncl signal.

If the SoPEC is in line sync slave mode it will receive a line sync pulse from the LineSyncMaster SoPEC
through the phi_lsyncl pin, which will be programmed into input mode. The phi_lsyncl input pin is treated
as an asynchronous input and is passed through a de-glitch circuit of programmable de-glitch duration
(LsyncDeglitchCnt).

The phi_lsyncl will remain low for LsyncLow cycles, and then high for LsyncHigh cycles. The phi_lsyncl
profile is repeated until the page is complete. The period of the phi_lsyncl is given by LsyncLow +
LsyncHigh cycles. Note that the LsyncPre value is only used to vary the time between the generation of the
first phi_lsyncl and the PageStart indication from the CPU. See Figure 239 for a reference diagram.

If the SoPEC device is in line sync slave mode, the LsyncMinPeriod register specifies the minimum
allowed phi_lsyncl period. Any phi_lsyncl pulses received before the LsyncMinPeriod has expired will
trigger a buffer underrun error.
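
The line sync profile can be modelled as a function of the phiclk cycle count. The Python sketch below is illustrative only: the function names are hypothetical, cycle 0 is taken to be the detected PrintStart transition, and the line is assumed to idle high before the first pulse (the text only fixes the low and high durations).

```python
def lsync_level(cycle, lsync_pre, lsync_low, lsync_high):
    """Level of the master's line sync output at a given phiclk cycle."""
    if cycle < lsync_pre:
        return 1  # assumed idle level before the first pulse
    phase = (cycle - lsync_pre) % (lsync_low + lsync_high)
    return 0 if phase < lsync_low else 1


def lsync_period(lsync_low, lsync_high):
    """Line sync period in phiclk cycles."""
    return lsync_low + lsync_high


def slave_pulse_too_early(cycles_since_last_pulse, lsync_min_period):
    """True if a received pulse violates the minimum period in slave mode."""
    return cycles_since_last_pulse < lsync_min_period
```

For example, with LsyncPre = 3, LsyncLow = 2 and LsyncHigh = 4 the output falls at cycle 3, rises at cycle 5 and falls again at cycle 9.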

32.5.2 Shift register signal control 

Once the PHI receives the line sync pulse, the sequence of data transfer to the printhead begins. All PHI
control signals are specified from the falling edge of the line sync.

The phi_srclk (and consequently phi_ph_data) is controlled by the SrclkPre and SrclkPost registers. The
SrclkPre specifies the number of phiclk cycles to wait before beginning to transfer data to the printhead.
Once data transfer has started, the profile of the phi_srclk is controlled by the PrintHeadRate register and
the status of the PHI input FIFO. For example it is possible that the input FIFO could empty and no data
would be transferred to the printhead while the PHI was waiting. After all the data for a printhead is
transferred to the PHI, it counts SrclkPost number of phiclk cycles. If a new phi_lsyncl falling edge arrives
before the count is complete the PHI will generate a buffer underrun interrupt (phi_icu_underrun).



32.5.3 Firing sequence signal control 






The profile of the phi_frclk pulses per line is determined by 4 registers: FrclkPre, FrclkLow, FrclkHigh and
FrclkNum. The FrclkPre register specifies the number of cycles between the line sync falling edge and the
phi_frclk pulse high. It remains high for FrclkHigh cycles and then low for FrclkLow cycles. The number
of pulses generated per line is determined by the FrclkNum register.

The phi_profile pin is specified in a similar manner by the ProfilePre, ProfileLow, ProfileHigh and
ProfileNum registers.

The phi_frclk period and the phi_profile period should be programmed the same, so FrclkHigh + FrclkLow
should equal ProfileHigh + ProfileLow, and the number of cycles for each in a line time should also be
equal, i.e. FrclkNum = ProfileNum.

The total number of cycles required to complete a firing sequence should be less than the phi_lsyncl period,
i.e. ((ProfileHigh + ProfileLow) * ProfileNum) + ProfilePre < (LsyncLow + LsyncHigh).
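
The three programming constraints above lend themselves to a configuration check before printing starts. The Python sketch below is a hypothetical helper, not part of the PHI: it takes the register values as plain integers in phiclk cycles and returns a list of the constraints that are violated.

```python
def check_firing_config(frclk_high, frclk_low, frclk_num,
                        profile_pre, profile_high, profile_low, profile_num,
                        lsync_low, lsync_high):
    """Return a list of violated firing sequence constraints (empty if none)."""
    problems = []
    if frclk_high + frclk_low != profile_high + profile_low:
        problems.append("frclk and profile periods differ")
    if frclk_num != profile_num:
        problems.append("FrclkNum and ProfileNum differ")
    firing_cycles = (profile_high + profile_low) * profile_num + profile_pre
    if firing_cycles >= lsync_low + lsync_high:
        problems.append("firing sequence does not fit in the line sync period")
    return problems
```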

[Figure 239: timing diagram showing the PrintStart edge followed by phi_lsyncl (LsyncPre, LsyncPeriod,
LsyncHigh), phi_srclk and phi_ph_data (SrclkPre, SrclkPost), phi_frclk (FrclkPre, FrclkHigh, FrclkLow) and
phi_profile (ProfilePre, ProfileHigh, ProfileLow).]

Figure 239. Printhead interface timing parameters



Figure 239 details the timing parameters controlling the PHI. All timing parameters are measured in num- 
ber of phiclk cycles. 



32.5.4 Page complete 

The PHI counts the number of lines processed through the interface. The line count is initialised to the
PageLenLine and decrements each time a line is processed. When the line count is zero it pulses the
phi_icu_page_finish signal. A pulse on the phi_icu_page_finish automatically resets the PHI Go register,
and can optionally cause an interrupt to the CPU. Should the page terminate abnormally, i.e. a buffer
underrun, the Go register will be reset and an interrupt generated.

32.5.5 Line sync interrupt

The PHI will generate an interrupt to the CPU after a predefined number of line syncs have occurred. The
number of line syncs to count is configured by the LineSyncInterrupt register. The interrupt can be
disabled by setting the register to zero.

32.6 Dot line margin

The PHI block allows the generation of margins either side of the received page from the LLU block. This
allows the page width used within PEP blocks to differ from the physical printhead size.

This allows SoPEC to store data for a page minus the margins, resulting in less storage requirements in the
shared DRAM and reduced memory bandwidth requirements. The difference between the dot data line
size and the line length generated by the PHI is the dot line margin length. There are two margins specified
for any sheet, a margin per printhead IC side.

The margin value is set by programming the DotMargin register per printhead IC. It should be noted that
the DotMargin register represents half the width of the actual margin (either left or right margin depending
on paper flow direction). For example, if the margin in dots is 1 inch (1600 dots), then DotMargin should
be set to 800. The reason for this is that the PHI only supports margin creation cases 1 and 3 described
below.
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See example in Figure 240. 



[Figure 240: diagram of a printhead IC with a 200 dot margin beside the print area (4772 dots per row),
showing line Y and line Y-5 and the paper direction, followed by timing traces of lsyncl, LLU data,
phi_ph_data and phi_srclk for margin cases 1, 2 and 3.]

Figure 240. Printhead timing with margining

In the example the margin for the type 0 printhead IC is set at 100 dots (DotMargin = 100), implying an
actual margin of 200 dots.

If case 1 is used the PHI takes a total of 9744 phi_srclk cycles to load the dot data into the type 0
printhead. It also requires 9744 dots of data from the LLU, which in turn get read from the DRAM. In this
case the first 100 and last 100 dots would be zero but are processed through the SoPEC system, consuming
memory and DRAM bandwidth at each step.

In case 2 the LLU no longer generates the margin dots; the PHI generates the zeroed out dots for the
margining. The phi_srclk still needs to toggle 9744 times per line, although the LLU only needs to generate
9544 dots, giving the reduction in DRAM storage and associated bandwidth. The case 2 scenario is not
supported by the PHI because the same effect can be achieved by means of case 1 and case 3.

If case 3 is used the benefits of case 2 are achieved, but the phi_srclk no longer needs to toggle the full
9744 clock cycles. The phi_srclk cycle count can be reduced by the margin amount (in this case
9744-100=9644 dots), and due to the reduction in phi_srclk cycles the phi_lsyncl period could also be
reduced, increasing the line processing rate and consequently increasing print speed. Case 3 works by
shifting the odd (or even) dots of a margin from line Y to become the even (or odd) dots of the margin of
line Y-4 (Y-5 adjusted due to being printed one line later). This works for all lines with the exception of the
first line, where there has been no previous line to generate the zeroed out margin. This situation is handled
by adding the line reset sequence to the printhead initialization procedure, and is repeated between pages
of a document. See section 32.8.3 on page 512.
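
The three margin cases differ only in how many dots the LLU must supply and how many phi_srclk cycles the PHI must generate per line. The Python sketch below tabulates those two numbers using the figures from the example. The function name is hypothetical, and the formulas are inferred from the worked example (a 9744 dot IC with DotMargin = 100) rather than taken from a stated rule.

```python
def margin_case(ic_dots, dot_margin, case):
    """Return (llu_dots, srclk_cycles) per line for margin case 1, 2 or 3."""
    actual_margin = 2 * dot_margin   # DotMargin holds half the actual margin
    if case == 1:
        # margin dots are stored in DRAM as zeros and shifted like any others
        return ic_dots, ic_dots
    if case == 2:
        # PHI inserts the zeros, but still clocks every dot position
        return ic_dots - actual_margin, ic_dots
    if case == 3:
        # clock count is reduced by the margin amount as well
        return ic_dots - actual_margin, ic_dots - dot_margin
    raise ValueError("case must be 1, 2 or 3")


for case in (1, 2, 3):
    print(case, margin_case(9744, 100, case))
```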

32.7 Dot counter 

The PHI keeps a dot usage count for each of the color planes (called AccumDotCount). If a dot is used in
a particular color plane the corresponding counter is incremented. Each counter is 32 bits wide and
saturates if not reset. A write to the DotCountSnap register causes the AccumDotCount[N] values to be
transferred to the DotCount[N] registers (where N is 5 to 0, one per color). The AccumDotCount registers
are cleared on value transfer.

The DotCount[N] registers can be written to or read from by the CPU at any time. On reset the counters
are reset to zero.

The dot counter only counts dots that are passed from the LLU through the PHI to the printhead. Any dots
generated by direct CPU control of the PHI pins will not be counted.
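
The snapshot behaviour of the dot counters can be illustrated with a small model. The Python sketch below uses hypothetical class and method names. It models six color planes, 32 bit saturating accumulators, and the transfer-and-clear performed on a write to DotCountSnap.

```python
class DotCounterModel:
    """Toy model of the PHI dot usage counters (hypothetical names)."""

    MAX_COUNT = 0xFFFFFFFF  # 32 bit counters saturate at this value

    def __init__(self, num_colors=6):
        self.accum_dot_count = [0] * num_colors
        self.dot_count = [0] * num_colors

    def count_dot(self, color):
        # increment the accumulator for this color plane, saturating at 32 bits
        if self.accum_dot_count[color] < self.MAX_COUNT:
            self.accum_dot_count[color] += 1

    def write_dot_count_snap(self):
        # copy the accumulators to the readable registers, then clear them
        self.dot_count = list(self.accum_dot_count)
        self.accum_dot_count = [0] * len(self.accum_dot_count)
```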

32.8 CPU IO control

The PHI interface provides a mechanism for the CPU to directly control the PHI interface pins, allowing
the CPU to access the bi-lithic printhead to:

• Determine printhead temperature 

• Test for and determine dead nozzles for each printhead IC 

• Printhead IC initialization 

• Printhead pre-heat function 

The CPU can gain direct control of the printhead interface connections by setting the PrintHeadCpuCtrl 
register to one. Once enabled the printhead bits are driven directly by the PrintHeadCpuOut control regis- 
ter, where the values in the register are reflected directly on the printhead pins and the status of the print- 
head input pins can be read directly from the PrintHeadCpuIn. The direction of pins is controlled by 
programming PrintHeadCpuDir register. The register to pin mapping is as follows: 



Table 161. CPU control and status registers mapping to printhead interface

Register | Bits | Printhead pin
PrintHeadCpuOut | 1:0 | phi_ph_data_o[0][1:0]
PrintHeadCpuOut | 3:2 | phi_ph_data_o[1][1:0]
PrintHeadCpuOut | 4 | phi_lsyncl_o
PrintHeadCpuOut | 5 | phi_readl
PrintHeadCpuOut | 7:6 | phi_srclk[1:0]
PrintHeadCpuOut | 8 | phi_frclk
PrintHeadCpuOut | 9 | phi_profile
PrintHeadCpuDir | 1:0 | phi_ph_data_e[0][1:0] direction control (1 - output mode, 0 - input mode)
PrintHeadCpuDir | 3:2 | phi_ph_data_e[1][1:0] direction control (1 - output mode, 0 - input mode)
PrintHeadCpuDir | 4 | phi_lsyncl_e direction control (1 - output mode, 0 - input mode)
PrintHeadCpuIn | 1:0 | phi_ph_data_i[0][1:0]
PrintHeadCpuIn | 3:2 | phi_ph_data_i[1][1:0]
PrintHeadCpuIn | 4 | phi_lsyncl_i
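
Software that drives the printhead pins directly needs to pack the individual pin values into the PrintHeadCpuOut register. The Python sketch below shows one way to do that from the bit positions in Table 161. The function name and defaults are hypothetical, and placing phi_srclk[1:0] at bits 7:6 is an assumption based on the gap between bit 5 and bit 8.

```python
def pack_cpu_out(data0=0, data1=0, lsyncl=1, readl=1, srclk=0, frclk=0, profile=0):
    """Pack pin values into a PrintHeadCpuOut register value."""
    return ((data0 & 0x3)              # bits 1:0  data pins for printhead IC 0
            | (data1 & 0x3) << 2       # bits 3:2  data pins for printhead IC 1
            | (lsyncl & 0x1) << 4      # bit 4     line sync
            | (readl & 0x1) << 5       # bit 5     read control
            | (srclk & 0x3) << 6       # bits 7:6  shift register clocks (assumed position)
            | (frclk & 0x1) << 8       # bit 8     fire clock
            | (profile & 0x1) << 9)    # bit 9     profile


print(hex(pack_cpu_out(frclk=1)))
```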



In direct CPU control mode it is the responsibility of the CPU to drive the printhead correctly and not
create situations where the printhead could be destroyed, such as activating all nozzles together.

Note that the following procedures are based on current printhead capabilities, and are subject to change.

32.8.1 Dead nozzle information capture 

The CPU (via the direct printhead control mechanism) has the capability of testing each of the nozzles in
the printhead and determining which nozzles are dead. The resultant dead nozzle information is processed
by the CPU to generate the dead nozzle table used by the DNC.

32.8.1.1 Nozzle test procedure

The nozzle test software must first initialize the fire pattern generator for each printhead IC as normal, then
it must initialize the fire pattern register as normal. The fire pattern generator parameters must be chosen
so as to create a fire pattern where only one nozzle is firing at a time.

For example, consider a printhead constructed with a 7:3 configuration, where the left printhead IC is
7 inches and the right 3 inches. The fire pattern length is equal to the number of dots in a half line
(NLEN = n-1, where n = 9744 / 2 = 4872), with COUNT = 1 and B = 0. The fire generator in the printhead
needs to be initialized with NLEN=4871, COUNT=1, B=0. See Section 32.8.4 for exact details on how to
program the fire pattern generator.

Once the generator is set up the nozzle test software puts the printhead into FIRE_GEN mode and the fire
pattern is loaded into the fire shift registers.

The next step is to load the dot data shift registers with a test pattern. Any test pattern could be used, but it
should be chosen so as to allow only one color to fire at a time. Once the printhead shift registers are
initialized the software can begin the nozzle test sequence.

The printhead is put in FIRE_GEN mode, which resets the test circuit; both phi_srclk and phi_frclk are held
inactive. After a pre-determined time the printhead is put in TEST_MODE, where the nozzle is tested.
The test software toggles the phi_profile output pin and then samples the test result on the phi_ph_data pin.
The test software then generates one phi_frclk pulse to advance the fire pattern and repeats the profile
pulse and test result capture as before. This procedure is repeated for all dots in the half dot line. Once the
test result for a particular dot line is complete the whole procedure is repeated 12 times, once for each half
dot line.



Doc: SoPEC_hardware_design 
Version: 2.3 



S3 Proprietary Document 



-» Nov 2002 
Page 510 



SoPEC : Hardware Design 



The dead nozzle software collates all the nozzle test results and produces the dead nozzle table for use by
the DNC.



[Figure: timing diagram of phi_lsyncl, phi_readl, phi_srclk and phi_ph_data across the FIRE_INIT,
FIRE_GEN, NORMAL and alternating FIRE_GEN / TEST_MODE phases, carrying the fire init data, the test
pattern data and the nozzle test results. The test phase is repeated once per nozzle.]

Figure 241. Nozzle Test Modes & Setup



32.8.2 Temperature capture 



Occasionally the CPU will need to sample the printhead temperature and possibly adjust the firing profile 
based on the result. 

To capture the printhead temperature, the printhead must be put into TEST_MODE, and the
phi_ph_data_i pin into input mode. The CPU will toggle phi_frclk and then sample phi_ph_data_i to
capture the temperature data. The cycle is repeated N times, and the N bits of data are used to generate the
printhead temperature value. The temperature capture waveform is shown in Figure 242.

The exact number of bits required (i.e. N) and the temperature value generation mechanism is currently 
undefined. 



[Figure: timing diagram of phi_lsyncl, phi_readl and phi_frclk in TEST_MODE, showing Clock 0, Clock 1,
through Clock N.]

Figure 242. Temperature Capture Waveform

32.8.3 Printhead initialization procedure

In order to use the printhead for the first time the CPU must download parameters for controlling the fire
pattern generator. The download is performed by entering the FIRE_INIT mode. Data is transferred
through the phi_ph_data pins (one pin per printhead IC) and clocked into the printhead on the
rising edge of phi_frclk. In total 29 clock cycles are required to transfer the full set of parameters.



Table 162. Parameters for Fire Pattern Initialization

Parameter   Bits   Description

NLEN        14     Fire pattern length. The value defines the length of the fire pattern,
                   NLEN = N - 1, where N is the pattern length.

COUNT       14     Defines the remaining number of clock cycles required to generate the
                   fire pattern. It is given by COUNT = (La/2) mod N - 1, where La is the dot
                   length of the longer printhead, or
                   COUNT = (La - ((Lb/2) mod N)) mod N - 1 for the shorter printhead.

B           1      Select shift register inversion bit.
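
The field widths in Table 162 add up to the 29 clock cycles quoted above (14 + 14 + 1). A minimal sketch of building the bit stream follows. The transmission order of the fields and the MSB-first bit order are assumptions made for illustration, since they are not stated here, and the function and constant names are hypothetical.

```python
# Field widths from Table 162. The field order is an assumption.
FIELDS = (("NLEN", 14), ("COUNT", 14), ("B", 1))


def serialize_fire_params(values: dict) -> list:
    """Return the bits to present, one per phi_frclk rising edge.

    MSB-first ordering within each field is assumed, not specified.
    """
    bits = []
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        bits.extend((value >> i) & 1 for i in reversed(range(width)))
    return bits


stream = serialize_fire_params({"NLEN": 4871, "COUNT": 1, "B": 0})
print(len(stream))
```

Whatever the real ordering is, the range check is the useful part: a value that does not fit its field would silently corrupt the neighbouring parameter.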



Once the generator is initialized the fire pattern and select pattern need to be created and shifted into their
respective shift registers. The printheads are put into FIRE_GEN mode and phi_frclk is toggled La
times, where La is the length of the longer printhead in dots. As phi_frclk is a common signal for both
printheads, if the printhead ICs are of different lengths one printhead IC will get clocked too
many times by phi_frclk. The fire pattern generator internal to each printhead IC takes account of this. See
Section 32.8.4 Fire pattern generator.

If dot line margining is to be used the dot data registers in the margining region in the printhead IC need to
be initialized to zero before any line is printed. See section 32.6 on page 507 for a full explanation of dot
line margin setup. The CPU does this by entering NORMAL_MODE and filling the dot data shift register
with zeros. This is performed by clocking phi_srclk dot margin times for each
printhead IC. As phi_srclk is not common to both printhead ICs the number of clock cycles can be different
for each printhead IC.

Once the printhead initialization is complete control of the printhead can be released to the PHI to allow 
printing to begin. 

32.8.4 Fire pattern generator 

The fire pattern generator is logic within each printhead IC used to generate the fire pattern and the select
shift pattern. The fire pattern generator must be initialized by the SoPEC device before a page can be
printed. The SoPEC uses the CPU direct IO control of the printhead pins to download the initialization
parameters and generate the initialization sequence.

32.9 Implementation

32.9.1 Definitions of I/O 

Table 163. Printhead Interface I/O definition

Port name                  Pins   I/O   Description

Clocks and Resets
pclk                       1      In    System clock.
phiclk                     1      In    Printhead interface clock (doclk/3) used to transfer data
                                        from the pclk to the doclk domains.
doclk                      1      In    Data out clock (2x pclk) used to transfer data to the
                                        printhead.
prst_n                     1      In    System reset, synchronous active low. Synchronous to pclk.
phirst_n                   1      In    System reset, synchronous active low. Synchronous to
                                        phiclk.
dorst_n                    1      In    System reset, synchronous active low. Synchronous to doclk.

General
phi_icu_print_rdy          1      Out   Indicates that the first line of data is transferred to the
                                        printhead. Active high.
phi_icu_page_finish        1      Out   Indicates that data for a complete page has transferred.
                                        Active high.
phi_icu_underrun           1      Out   Indicates the PHI has detected a buffer underrun. Active
                                        high.
phi_icu_linesync_int       1      Out   Indicates the PHI has detected LineSyncInterrupt number of
                                        line syncs.

Debug
debug_data_out[2:0]        3      In    Output debug data to be muxed on to the PHI pins.
debug_cntrl[2:0]           3      In    Control signal for each PHI bound debug data line
                                        indicating whether or not the debug data should be
                                        selected by the pin mux.

LLU Interface
llu_phi_data[1:0][5:0]     2x6    Out   Dot data from LLU to the PHI, each bit is a color plane
                                        5 downto 0.
                                        Bus 0 - Even dot data stream
                                        Bus 1 - Odd dot data stream
                                        Data is active when the corresponding bit is active in the
                                        llu_phi_avail bus.
phi_llu_ready[1:0]         2      In    Indicates that PHI is ready to accept data from the LLU.
                                        0 - Even dot data stream
                                        1 - Odd dot data stream
llu_phi_avail[1:0]         2      Out   Indicates valid data present on corresponding
                                        llu_phi_data.
                                        0 - Even dot data stream
                                        1 - Odd dot data stream

Printhead Interface
phi_ph_data_i[1:0][1:0]    2x2    In    Dot data input from printhead.
                                        Bus 0 - Printhead channel A
                                        Bus 1 - Printhead channel B
phi_ph_data_o[1:0][1:0]    2x2    Out   Dot data output to printhead. Each bus to each printhead
                                        contains 2 bits of data.
                                        Bus 0 - Printhead channel A
                                        Bus 1 - Printhead channel B
phi_ph_data_e[1:0][1:0]    2x2    Out   Dot data direction control. Pin is driving when high.
                                        Bus 0 - Printhead channel A
                                        Bus 1 - Printhead channel B
phi_srclk[1:0]             2      Out   Dot data shift clock used to clock in printhead data.
                                        Bus 0 - Printhead channel A
                                        Bus 1 - Printhead channel B
phi_readl                  1      Out   Common printhead mode control. Used in conjunction with
                                        phi_lsyncl to determine the printhead mode.
                                        0 - SoPEC receiving, printhead driving
                                        1 - SoPEC driving, printhead receiving
phi_frclk                  1      Out   Common fire pattern clock, needs to toggle once per fire
                                        cycle.
phi_profile                1      Out   Common pulse profile for all colors.
phi_lsyncl_o               1      Out   Capture dot data for next print line, output mode.
phi_lsyncl_e               1      In    phi_lsyncl output enable. When high the phi_lsyncl pin is
                                        driving.
phi_lsyncl_i               1      In    Line sync pulse from Master SoPEC.

PCU Interface
pcu_phi_sel                1      In    Block select from the PCU. When pcu_phi_sel is high both
                                        pcu_adr and pcu_dataout are valid.
pcu_rwn                    1      In    Common read/not-write signal from the PCU.
pcu_adr[7:2]               6      In    PCU address bus. Only 6 bits are required to decode the
                                        address space for this block.
pcu_dataout[31:0]          32     In    Shared write data bus from the PCU.
phi_pcu_rdy                1      Out   Ready signal to the PCU. When phi_pcu_rdy is high it
                                        indicates the last cycle of the access. For a write cycle
                                        this means pcu_dataout has been registered by the block
                                        and for a read cycle this means the data on phi_pcu_data
                                        is valid.
phi_pcu_data[31:0]         32     Out   Read data bus to the PCU.



32.9.2 PHI sub-block partition

[Figure: block diagram of the PHI sub-blocks, shaded by clock domain: pclk domain (160 MHz),
doclk domain (320 MHz) and phiclk domain (106 MHz).]

Figure 243. PHI block partition



32.9.3 Configuration registers

The configuration registers in the PHI are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for a description of the protocol and timing diagrams for reading and writing registers in the PHI.
Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads and
writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the PHI.
When reading a register that is less than 32 bits wide, zeros should be returned on the upper unused bit(s)
of phi_pcu_data. Table 164 lists the configuration registers in the PHI.
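
The two rules in this paragraph (word-aligned decode and zero-filled upper bits) can be illustrated with a small sketch. The helper names are hypothetical; only the 6-bit pcu_adr[7:2] decode and the zero-fill behaviour come from the text.

```python
def register_index(byte_address: int) -> int:
    """Map a byte address to a 32-bit register index using pcu_adr[7:2]."""
    return (byte_address >> 2) & 0x3F   # drop the 2 LSBs, keep 6 address bits


def read_register(raw_value: int, width_bits: int) -> int:
    """Return the 32-bit read data with the unused upper bits forced to zero."""
    return raw_value & ((1 << width_bits) - 1)


print(register_index(0x08), hex(read_register(0xFFFF_FFFF, 14)))
```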



Table 164. PHI registers description

Address     Register             Bits   Reset         Description

Control Registers
0x00        Reset                1      0x1           Active low synchronous reset, self de-activating.
                                                      A write to this register will cause a PHI block
                                                      reset.
0x04        Go                   1      0x0           Active high bit indicating the PHI is programmed
                                                      and ready to use. A low to high transition will
                                                      cause PHI block internal state to reset. Will be
                                                      automatically reset if a page finish or a buffer
                                                      underrun is detected.

General Control
0x08        PageLenLine          32     0x0000_0000   Specifies the number of dot lines in a page.
0x0C        PrintStart           1      0x0           A low to high transition triggers printing to
                                                      start. Only active in Master Mode.
0x10-0x14   DotMargin[1:0]       2x16   0x0000        Specifies, for each printhead IC, the width of
                                                      the margin in dots divided by 2.
                                                      0 - Printhead IC Channel A
                                                      1 - Printhead IC Channel B
0x18-0x2C   DotCount[5:0]        6x32   0x0000_0000   Indicates the number of dots used for a
                                                      particular color, where N specifies a color from
                                                      0 to 5. Value valid after a write access to
                                                      DotCountSnap.
0x30        DotCountSnap         1      0x0           Write access causes the AccumDotCount values to
                                                      be transferred to the DotCount registers. The
                                                      AccumDotCount values are reset afterwards.
0x34        PhiHeadSwap          1      0x0           Controls which signals are connected to
                                                      printhead channels A and B.
                                                      0 - Normal, specifies bit 0 is channel A, bit 1
                                                          is channel B
                                                      1 - Swapped, specifies bit 0 is channel B, bit 1
                                                          is channel A
0x38        PhiMode              1      0x0           Indicates whether the PHI is operating in master
                                                      or slave mode.
                                                      0 - Slave Mode
                                                      1 - Master Mode
0x3C-0x40   PhiSerialOrder       2x1    0x0           Specifies the serialization order of dots before
                                                      transfer to the printhead.
                                                      Bus 0 - Printhead Channel A
                                                      Bus 1 - Printhead Channel B
                                                      A 0 indicates order ABC, while 1 indicates CBA.
0x44-0x48   PrintHeadSize        2x16   0x0000        Specifies the number of non-margin dots in the
                                                      printhead ICs. If margining is to be used then
                                                      the configured PrintHeadSize should be adjusted
                                                      by the dot margin value, i.e. PrintHeadSize =
                                                      (PhysicalPrintHeadSize - (DotMargin * 2)).
                                                      Bus 0 - Specifies printhead on Channel A
                                                      Bus 1 - Specifies printhead on Channel B

CPU Direct PHI Control (See Table 161.)
0x4C        PrintHeadCpuIn       5      0x00          PHI interface pins input status. Only active in
                                                      direct CPU mode.
0x50        PrintHeadCpuDir      5      0x00          PHI interface pins direction control. Only
                                                      active in direct CPU mode.
0x54        PrintHeadCpuOut      10     0x000         PHI interface pins output control. Only active
                                                      in direct CPU mode.
0x58        PrintHeadCpuCtrl     1      0x0           Controls direct CPU access to the PHI pins.
                                                      0 - Normal Mode
                                                      1 - Direct CPU Control mode

Line Sync Control
0x5C        LsyncLow             16     0x0000        Number of phiclk cycles phi_lsyncl should remain
                                                      low.
0x60        LsyncHigh            16     0x0000        Number of phiclk cycles phi_lsyncl should remain
                                                      high.
0x64        LsyncPre             16     0x0000        Number of phiclk cycles between the PrintStart
                                                      rising transition and the generated phi_lsyncl
                                                      falling edge.
0x68        LsyncMinPeriod       24     0x00_0000     Minimum number of phiclk cycles between Lsync
                                                      pulses. Lsync pulses of a shorter period will be
                                                      rejected. Only used in slave mode.
0x6C        LsyncDeglitchCnt     4      0x3           Number of phiclk cycles to filter the incoming
                                                      Lsync pulse from the master. Only used in slave
                                                      mode.
0x70        LineSyncInterrupt    16     0x0000        Number of line syncs to occur before generating
                                                      an interrupt. When set to zero the interrupt is
                                                      disabled.

Shift Register Control
0x74        SrclkPre             14     0x0000        Number of phiclk cycles between the phi_lsyncl
                                                      falling edge and phi_srclk pulse generation, or
                                                      printhead data transfer.
0x78        SrclkPost            14     0x0000        Number of phiclk cycles of margin allowed from
                                                      the last srclk pulse in a line to the next line
                                                      sync.
0x7C-0x80   PrintHeadRate[1:0]   2x16   0xFFFF        Specifies the active to inactive ratio of
                                                      phi_srclk for the printhead ICs. A 1 indicates
                                                      active.
                                                      Bus 0 - Printhead IC channel A
                                                      Bus 1 - Printhead IC channel B
0x84        DotOrderMode         1      0x0           Specifies the dot transmit order to the
                                                      printhead Channel A. Printhead Channel B is
                                                      always the opposing order.
                                                      0 - Even before Odd dots
                                                      1 - Odd before Even dots

Fire Control
0x88        ProfilePre           14     0x0000        Number of phiclk cycles between the phi_lsyncl
                                                      falling edge and phi_profile pulse generation.
0x8C        ProfileLow           14     0x0000        Number of phiclk cycles phi_profile should
                                                      remain low.
0x90        ProfileHigh          14     0x0000        Number of phiclk cycles phi_profile should
                                                      remain high.
0x94        ProfileNum           16     0x0000        Number of profile pulses per line time.
0x98        FrclkPre             14     0x0000        Number of phiclk cycles between the phi_lsyncl
                                                      falling edge and phi_frclk pulse generation.
0x9C        FrclkLow             14     0x0000        Number of phiclk cycles phi_frclk should remain
                                                      low.
0xA0        FrclkHigh            14     0x0000        Number of phiclk cycles phi_frclk should remain
                                                      high.
0xA4        FrclkNum             16     0x0000        Number of phi_frclk pulses per line time.

Working Registers
0xA8-0xAC   LineDotCnt           2x16   0x0000        Indicates the number of dots processed in the
                                                      current line.
                                                      Bus 0 - Printhead Channel A
                                                      Bus 1 - Printhead Channel B
                                                      (Read Only Registers)
0xB0        LineCnt              32     0x0000_0000   Indicates the number of lines processed in this
                                                      page. (Read Only Register)



The configuration registers in the PHI block are clocked at pclk rates but several blocks in the PHI are
clocked by different and asynchronous clocks. Configuration values are not re-synchronized, so it is
important that the Go register be set to zero while updating configuration values. This prevents logic from
entering unknown states due to metastable clock domain transfers.

Some registers can be written to at any time: the direct CPU control registers (PrintHeadCpuIn,
PrintHeadCpuDir, PrintHeadCpuOut and PrintHeadCpuCtrl), the Go register and the PrintStart register.
All registers can be read from at any time.

When one of the direct CPU control registers is written to, the configuration registers block generates a 2
cycle pulse (cpu_io_wr) which is used to transfer the pin control signals from the pclk domain to the phiclk
domain. The cpu_io_wr signal is a delayed version of the write enable from the CPU.



32.9.4 Dot counter 



The dot counter keeps a running count of the number of dots fired for each color plane. The counters are
32 bits wide and will saturate. When the CPU wants to read the dot count for a particular color plane it
must write to the DotCountSnap register. This causes all 6 running counter values to be transferred to the
DotCount registers in the configuration registers block. The running counter values are then reset.

    // reset if being snapped
    if (dot_cnt_snap == 1) then {
        dot_count[5:0] = accum_dot_count[5:0]
        accum_dot_count[5:0] = 0
    }
    // update the counts
    for (color=0; color < 6; color++) {
        if (accum_dot_count[color] != 0xffff_ffff) {
            // data valid, first dot stream
            data_valid = ((phi_llu_ready[0] == 1) AND (llu_phi_avail[0] == 1))
            if ((data_valid == 1) AND (llu_phi_data[0][color] == 1)) then
                accum_dot_count[color]++
            // data valid, second dot stream
            data_valid = ((phi_llu_ready[1] == 1) AND (llu_phi_avail[1] == 1))
            if ((data_valid == 1) AND (llu_phi_data[1][color] == 1)) then
                accum_dot_count[color]++
        }
    }
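
The counter behaviour can be modelled in software as a sketch for experimentation. The class and method names are hypothetical. Unlike the pseudocode, which tests for saturation once per cycle, this model tests before each increment so that a counter one below the maximum cannot wrap when both streams carry a dot; that is a deliberate simplification, not a statement about the hardware.

```python
SATURATED = 0xFFFF_FFFF   # 32-bit saturating counters
NUM_COLORS = 6


class DotCounter:
    """Software model of the running and snapshot dot counters."""

    def __init__(self):
        self.accum = [0] * NUM_COLORS       # running counters
        self.dot_count = [0] * NUM_COLORS   # snapshot visible to the CPU

    def snap(self):
        """Model a write to DotCountSnap: copy, then clear the running counts."""
        self.dot_count = list(self.accum)
        self.accum = [0] * NUM_COLORS

    def clock(self, ready, avail, data):
        """ready, avail: two flags; data: two lists of six bits (even, odd)."""
        for stream in range(2):
            if not (ready[stream] and avail[stream]):
                continue                    # no valid transfer on this stream
            for color in range(NUM_COLORS):
                if data[stream][color] and self.accum[color] != SATURATED:
                    self.accum[color] += 1


counter = DotCounter()
counter.clock([1, 1], [1, 1], [[1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0]])
counter.snap()
print(counter.dot_count)
```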



32.9.5 Sync generator



The sync generator logic has two modes of operation, master and slave mode. In master mode (configured
by the PhiMode register) it generates the lsyncl_o output based on configured values and control triggers
from the PHI controller. In slave mode it de-glitches the incoming lsyncl_i signal, and filters the lsyncl
signal with the minimum configured period.



[Figure: state diagram with states Reset, SyncPre, SyncLow, SyncHigh, SyncWait and SyncPeriod.
Transitions are taken when the cycle count reaches zero, qualified by sync_en, last_line and lsync_pulse.]

The machine remains in the same state by default.
All outputs are zero unless otherwise stated.

State Description:
Reset:       Normal reset state
SyncPre:     Count the LsyncPre number of clock cycles
SyncLow:     Count the LsyncLow number of clock cycles
SyncHigh:    Count the LsyncHigh number of clock cycles
SyncWait:    Wait for an input lsync pulse
SyncPeriod:  Count the LsyncMinPeriod number of clock cycles

Figure 244. Sync generator state diagram

After reset or a pulse on phi_go_pulse the machine returns to the Reset state, regardless of what state it's
currently in.

The state machine waits until it's enabled (sync_en == 1) by the PHI controller state machine. When
enabled it can proceed to the SyncPre or SyncWait state depending on whether the state machine is configured
in master or slave mode. In master mode it generates the lsyncl pulses, in slave mode it receives and filters
the lsyncl pulses from the master sync generator.

On transition to the SyncPre state a counter is loaded with the LsyncPre value, and while in the SyncPre
state the counter is decremented. When the count is zero the machine proceeds to the SyncLow state, pulsing the
line_st signal on transition and loading the counter with the LsyncLow value. This indicates to the PHI
controller the line start, aligned to the lsyncl negative edge.

The machine waits in the SyncLow state until the counter has decremented to zero. It proceeds to the
SyncHigh state and counts LsyncHigh number of cycles. While in the SyncLow state the lsyncl_o output is set to
0 and in SyncHigh the lsyncl_o output is set to 1.

When the count is zero and the current line is not the last (last_line == 0), the machine returns to the
SyncLow state to begin generating a new line sync pulse. The transition pulses the line_st signal to the PHI
controller.

The loop is repeated until the current line is the last (last_line == 1), and the machine returns to the Reset
state to wait for the next page start.

In slave mode the state machine proceeds to the SyncWait state when enabled. It waits in this state until an
lsync_pulse is received from the input de-glitch circuit. When a pulse is detected the machine jumps to the
SyncPeriod state and begins counting down the LsyncMinPeriod number of clock cycles before returning
to the SyncWait state. On transition from the SyncWait to the SyncPeriod state the line_st signal to the PHI
controller is pulsed to indicate the line start. While in the SyncPeriod state, if an lsync_pulse is detected the
state machine will signal a sync error (via sync_err) to the PHI controller and cause a buffer underrun
interrupt.
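
The slave-mode behaviour can be sketched as a cycle-based model. The function name and the list-based interface are hypothetical. The sketch assumes that the SyncPeriod state lasts exactly LsyncMinPeriod cycles (at least 1) and that the machine simply keeps counting after a sync error; the text does not specify either point.

```python
def slave_sync_filter(pulses, min_period):
    """Model the SyncWait / SyncPeriod loop of the sync generator.

    pulses: one 0/1 sample of lsync_pulse per phiclk cycle.
    Returns (line_st_cycles, sync_err_cycles).
    """
    line_st, sync_err = [], []
    state, count = "SyncWait", 0
    for cycle, pulse in enumerate(pulses):
        if state == "SyncWait":
            if pulse:
                line_st.append(cycle)            # line start signalled
                state, count = "SyncPeriod", min_period
        else:                                    # SyncPeriod
            if pulse:
                sync_err.append(cycle)           # pulse arrived too early
            count -= 1
            if count <= 0:
                state = "SyncWait"
    return line_st, sync_err


print(slave_sync_filter([1, 0, 0, 1, 0, 0, 0, 0, 1], 5))
```

In this run the pulse at cycle 3 falls inside the minimum period and is reported as an error, while the pulses at cycles 0 and 8 are accepted as line starts.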



32.9.5.1 Lsyncl input de-glitch

The lsync_i input is considered an asynchronous input to the PHI, and is passed through a synchronizer to
reduce the possibility of metastable states occurring before being passed to the de-glitch logic.

The input de-glitch logic rejects input states of duration less than the configured number of clock cycles
(lsync_deglitch_cnt). Input states of greater duration are reflected on the output, and are negative edge
detected to produce the lsync_pulse signal to the main generator state machine. The counter logic is given by:

    if (lsync_i != lsync_i_delay) then
        cnt = lsync_deglitch_cnt
        output_en = 0
    elsif (cnt == 0) then
        cnt = cnt
        output_en = 1
    else
        cnt--
        output_en = 0
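
A software model of this counter makes the filtering effect visible. The function name is hypothetical, and two details are assumptions: that the filtered output takes the input level whenever output_en is set, and that it starts at the first input sample.

```python
def deglitch(samples, deglitch_cnt):
    """Return the filtered level per cycle for a synchronized lsync_i stream."""
    if not samples:
        return []
    prev = samples[0]        # lsync_i_delay: the previous input sample
    cnt = deglitch_cnt
    level = samples[0]       # assumed initial filtered output
    out = []
    for sample in samples:
        if sample != prev:
            cnt = deglitch_cnt          # input changed: restart the count
            output_en = False
        elif cnt == 0:
            output_en = True            # input stable for long enough
        else:
            cnt -= 1
            output_en = False
        if output_en:
            level = sample              # reflect the stable input on the output
        prev = sample
        out.append(level)
    return out


glitchy = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(deglitch(glitchy, 2))
```

The single-cycle low at index 4 never reaches the output, while the sustained low at the end does, after the configured number of stable cycles.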



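[Figure: RTL diagram in which lsync_i passes through a synchronizer, is compared with lsync_i_delay,
and feeds counter logic loaded with lsync_deglitch_cnt. The resulting output_en gates a pulse generator
that produces lsync_pulse.]

Figure 245. Line sync de-glitch RTL diagram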



32.9.5.2 Line Sync Interrupt logic

The line sync interrupt logic counts the number of line syncs that occur (either internally or externally
generated line syncs) and determines whether to generate an interrupt or not. The number of line syncs it
counts before an interrupt is generated is configured by the LineSyncInterrupt register. The interrupt is
disabled if LineSyncInterrupt is set to zero.

    // implement the interrupt counter
    if (phi_go_pulse == 1) then
        line_count = 0
    elsif ((line_st == 1) AND (line_count == 0)) then
        line_count = linesync_int
    elsif ((line_st == 1) AND (line_count != 0)) then
        line_count--
    // determine when to pulse the interrupt
    if (linesync_int == 0) then // interrupt disabled
        phi_icu_linesync_int = 0
    elsif ((line_st == 1) AND (line_count == 1)) then
        phi_icu_linesync_int = 1
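
The following sketch is a literal reading of this pseudocode, useful for checking on which line syncs the interrupt fires. The function name is hypothetical. It assumes registered timing, in which both blocks see the counter value from before the current cycle's update, and it ignores phi_go_pulse.

```python
def linesync_interrupts(line_st_stream, linesync_int):
    """Return the cycles on which phi_icu_linesync_int would pulse."""
    line_count = 0
    pulses = []
    for cycle, line_st in enumerate(line_st_stream):
        # The interrupt decision uses the counter value before this update.
        if linesync_int != 0 and line_st and line_count == 1:
            pulses.append(cycle)
        if line_st:
            if line_count == 0:
                line_count = linesync_int   # reload on the next line sync
            else:
                line_count -= 1
    return pulses


print(linesync_interrupts([1] * 10, 3))
```

Under these assumptions a setting of 3 gives an interrupt on every fourth line sync, because one line sync is consumed by the reload. Whether the hardware counts that way is not confirmed by the text.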

32.9.6 Fire generator 

The fire generator block creates the signal profile for the phi_frclk and phi_profile signals to the printhead.
The profile is based on configured values and is timed in relation to the fire_sync pulse from the PHI
controller block.



The machine remains in the same state by default.
All outputs are zero unless otherwise stated.

State Description:
Reset:     Normal reset state
FirePre:   Count the FrclkPre number of clock cycles, repeat count set to FrclkNum
FireHigh:  Count the FrclkHigh number of clock cycles
FireLow:   Count the FrclkLow number of clock cycles

Figure 246. Fire generator state diagram

The fire generator consists of 2 identical state machines for creating the phi_frclk and phi_profile signals
respectively.

The machine is reset to the Reset state when phi_go_pulse == 1 or the reset is active, regardless of the
current state.

The machine waits in the Reset state until it receives a fire_st pulse from the PHI controller. The controller
will generate a fire_st pulse at the beginning of each dot line. On the state transition the cycle counter is
loaded with the FrclkPre value and the repeat counter is loaded with the FrclkNum value.







The state machine waits in the FirePre state until the cycle counter is zero, after which it jumps to the
FireHigh state and loads the cycle counter with the FrclkHigh value. Again the state machine waits until the count
is zero and then proceeds to the FireLow state. On transition the cycle counter is loaded with the FrclkLow
value. The state machine waits in the FireLow state while the cycle counter is decremented.

When the cycle counter reaches zero and the repeat_count is non-zero, the repeat_count is decremented,
the cycle counter is loaded with the FrclkHigh value and the state machine jumps to the FireHigh state to
repeat the phi_frclk generation cycle. The loop is repeated until the repeat_count is zero, at which point the
state machine goes to the Reset state and waits for the next fire_st pulse.

When in the Reset state the fire_rdy signal is active to indicate to the controller that the fire generator is
ready.
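
The resulting waveform can be sketched as follows. The function name is hypothetical. The sketch assumes each state lasts exactly the configured number of cycles and that the output is high only in FireHigh. Read literally, the loop described above produces FrclkNum + 1 high phases, which the model reproduces; whether the hardware counts that way is not confirmed by the text.

```python
def frclk_waveform(pre, high, low, num):
    """Return the phi_frclk level for each cycle following a fire_st pulse."""
    wave = [0] * pre                # FirePre: output stays low
    repeat_count = num              # loaded with FrclkNum on fire_st
    while True:
        wave += [1] * high          # FireHigh
        wave += [0] * low           # FireLow
        if repeat_count == 0:
            break                   # return to Reset
        repeat_count -= 1           # go round the FireHigh / FireLow loop again
    return wave


print(frclk_waveform(pre=2, high=1, low=2, num=1))
```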



32.9.7 PHI controller 

The PHI controller is responsible for controlling all functions of the PHI block on a line by line basis. It
controls and synchronizes the sync generator, the fire generator, and the datapath unit, as well as signalling
the PHI status back to the CPU. It also contains a line counter to determine when a full page has completed
printing.

[Figure: state diagram with states Reset, FirstLine, PrintStart, SyncWait, LineTrans, LastLine and
Underrun. Transitions are qualified by phi_go, data_fin, print_start, line_st, fire_rdy and the line count;
the transition into LineTrans asserts data_st, fire_st and sync_st.]

Figure 247. PHI controller state machine

The PHI controller state machine is reset to the Reset state by a reset or phi_go_pulse == 1.
It will remain in Reset until the block is enabled by phi_go == 1. Once enabled the state machine will jump
to the FirstLine state, trigger the transfer of one line of data to the printhead (data_st == 1) and the line
counter will be initialized to the page length (PageLenLine). Once the line is transferred (data_fin from the
datapath unit) the machine will go to the PrintStart state and signal the CPU using an interrupt that the PHI is
ready to begin printing (phi_icu_print_rdy). The line counter will also be decremented. It will then wait in
the PrintStart state until the CPU acknowledges the print ready signal and enables printing by writing to
the PrintStart register.

The state machine proceeds to the SyncWait state and waits for a line start condition (line_st == 1). The line
start condition is different depending on whether the PHI is configured as being in a master or slave
SoPEC (the PhiMode register). In either case the sync generator determines the correct line start source
and signals the PHI controller via the line_st signal. Once received the machine proceeds to the LineTrans
state, with the transition triggering the fire generator to start (fire_st), the datapath unit to start (data_st)
and the sync generator to start (sync_st).

While in the LineTrans state the fire generator, sync generator and datapath unit will be producing line data. When finished
processing a line the datapath unit will assert the line finished (line_fin) signal. If the line counter is not
equal to 1 (i.e. not the last line) the state machine will jump back to the SyncWait state and wait for the start
condition for the next line. The line counter will be decremented. If the line counter is one then the
machine will proceed to the LastLine state.

The LastLine state generates one more line of fire pulses to print the last line held in the shift registers of
the printhead. Once complete (fire_fin == 1) the state machine returns to the Reset state and waits for the
next page of data. On page completion the state machine generates a phi_icu_page_finish interrupt to
signal to the CPU that the page has completed. The phi_icu_page_finish will also cause the Go register to reset
automatically.

While the state machine is in the LineTrans state (or in the FirstLine state and the PHI is in slave mode) and
waiting for the datapath unit to complete line processing, it is possible (e.g. an excessive PEP stall) that a
new line start condition occurs but the datapath unit is not ready. In this case an underrun error is
generated. The state machine goes to the Underrun state and generates a phi_icu_underrun interrupt to the
CPU. The PHI cannot recover from a buffer underrun error; the CPU must reset the PEP blocks and
restart printing. The phi_icu_underrun will also cause the Go register to reset automatically.
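
The transitions described above can be collected into a single next-state function as a sketch. The function name and the dictionary of flags are hypothetical, the completion flag is called data_fin throughout, and reset handling (a reset or phi_go_pulse returning the machine to Reset from any state) is omitted, so Underrun is modelled as a terminal state.

```python
def phi_controller_step(state, line_count, page_len, sig):
    """One transition of the PHI controller; returns (state, line_count).

    sig holds 0/1 flags: go, data_fin, print_start, line_st, fire_fin, slave.
    Missing flags are treated as 0.
    """
    def flag(name):
        return sig.get(name, 0)

    if state == "Reset":
        if flag("go"):
            return "FirstLine", page_len          # load the line counter
    elif state == "FirstLine":
        if flag("slave") and flag("line_st") and not flag("data_fin"):
            return "Underrun", line_count
        if flag("data_fin"):
            return "PrintStart", line_count - 1
    elif state == "PrintStart":
        if flag("print_start"):
            return "SyncWait", line_count
    elif state == "SyncWait":
        if flag("line_st"):
            return "LineTrans", line_count        # starts fire, data and sync
    elif state == "LineTrans":
        if flag("line_st") and not flag("data_fin"):
            return "Underrun", line_count         # datapath not ready in time
        if flag("data_fin"):
            if line_count == 1:
                return "LastLine", line_count
            return "SyncWait", line_count - 1
    elif state == "LastLine":
        if flag("fire_fin"):
            return "Reset", line_count            # page finished
    return state, line_count


state, count = phi_controller_step("Reset", 0, 3, {"go": 1})
print(state, count)
```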

32.9.8 CPU IO control

The CPU IO control block is responsible for accepting CPU direct IO control signals from the
configuration registers (at pclk frequency) and transferring them to phiclk frequency. It also accepts the input signals
from the printhead and re-synchronizes them to the pclk domain, and takes debug signals from the RDU and
muxes them to output pins.

Table 161 contains the direct mapping of configuration registers to printhead IO pins. Direct CPU control
is enabled only when PrintHeadCpuCtrl is set to one. In normal operation (i.e. PrintHeadCpuCtrl == 0)
the printhead data pins are always in output mode (phi_ph_data_e = 1), phi_lsyncl will be in output mode if
the SoPEC is the master, i.e. phi_lsyncl_e = phi_mode, and phi_readl will be set high.

The pseudocode for the CPU IO control is:

    if (printhead_cpu_ctrl == 1) then // CPU access enabled
        // outputs
        phi_ph_data_o[0][1:0] = printhead_cpu_out[1:0]
        phi_ph_data_o[1][1:0] = printhead_cpu_out[3:2]
        phi_lsyncl_o          = printhead_cpu_out[4]
        phi_readl             = printhead_cpu_out[5]
        phi_srclk[1:0]        = printhead_cpu_out[7:6]
        phi_frclk             = printhead_cpu_out[8]
        phi_profile           = printhead_cpu_out[9]
        // direction control
        phi_ph_data_e[0][1:0] = printhead_cpu_dir[1:0]
        phi_ph_data_e[1][1:0] = printhead_cpu_dir[3:2]
        phi_lsyncl_e          = printhead_cpu_dir[4]
        // input assignments
        printhead_cpu_in[1:0] = synchronize(phi_ph_data_i[0][1:0])
        printhead_cpu_in[3:2] = synchronize(phi_ph_data_i[1][1:0])
        printhead_cpu_in[4]   = synchronize(phi_lsyncl_i)
    else // normal connections
        // outputs
        phi_ph_data_o[0][1:0] = ph_data[0][1:0]
        phi_ph_data_o[1][1:0] = ph_data[1][1:0]
        phi_lsyncl_o          = lsync_o
        phi_readl             = 1
        phi_srclk[1:0]        = srclk[1:0]
        phi_frclk             = frclk
        phi_profile           = profile
        // direction control
        phi_ph_data_e[0][1:0] = 0x3
        phi_ph_data_e[1][1:0] = 0x3
        phi_lsyncl_e          = phi_mode // depends on Master or Slave mode
    // inputs
    lsyncl_i = phi_lsyncl_i // connected regardless
    // debug overrides any other connections
    if (debug_cntrl[0] == 1) then
        phi_frclk = debug_data_out[0]
        phi_readl = pclk
    if (debug_cntrl[1] == 1) then
        phi_profile = debug_data_out[1]
    if (debug_cntrl[2] == 1) then
        phi_lsyncl_o = debug_data_out[2]
        phi_lsyncl_e = 1

The debug signalling is controlled by the RDU block (see Section 11.8 Realtime Debug Unit (RDU)). The
IO control in the PHI muxes debug data onto the PHI pins based on the control signals from the RDU.

32.9.9 Datapath Unit

[Figure: block diagram of the datapath unit, with dot_order_mode, printhead_size[i], dot_margin[i] and
phi_serial_order[i] as configuration inputs, shaded by clock domain: pclk domain (160 MHz), doclk domain
(320 MHz) and phiclk domain (106 MHz).]

Figure 248. Datapath Unit partition

32.9.10 Dot order controller

[Figure: state diagram with states Reset, LineStart, LineMid and LineEnd. Reset asserts dot_order_rdy;
the data_st transition asserts dot_cnt_rst; mode_sel follows dot_order_mode in LineStart and LineMid
and is inverted in LineEnd.]

The machine remains in the same state by default.
All outputs are zero unless otherwise stated.

State Description:
Reset:      Normal reset state
LineStart:  Start processing first part of the line, wait for both mid_pt to be active
LineMid:    Switch over wait state, allow pipeline to clear
LineEnd:    Line end processing, wait for both line_fin to be active

Figure 249. Dot Order controller state diagram



The dot order controller is responsible for controlling the dot order blocks. It monitors the status of each
block and determines the switch over point, at which the connections from odd and even dot streams to
printhead channels are swapped.

The machine is reset to the Reset state when phi_go_pulse == 1 or the reset is active. The machine will
wait until it receives a data_st pulse from the PHI controller before proceeding to the LineStart state. On
the transition to the LineStart state it will reset the dot counter in each dot order block via the dot_cnt_rst
signal.

While in the LineStart state both dot order blocks are enabled (gen_en = 11). The dot order blocks process
data until each of them reaches its mid point. The mid point of a line is defined by the configured printhead
size (i.e. print_head_size). When a dot order block reaches the mid point it immediately stops processing
and waits for the remaining dot order block. When both dot order blocks are at the mid point (mid_pt ==
11) the controller clocks through the LineMid state to allow the pipeline to empty and immediately goes to
the LineEnd state.

In the LineEnd state the mode_sel is switched and the dot order blocks are re-enabled. In this state the dot
order blocks read data from the opposite LLU dot data stream to the one used in the LineStart state. The
controller remains in the LineEnd state until both dot order blocks have processed a line, i.e. line_fin == 11.

On completion of both blocks the controller returns to the Reset state and again awaits the next data_st
pulse from the PHI controller. When in the Reset state the machine signals the PHI controller that it is ready
to begin processing dot data via the dot_order_rdy signal.
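
The state sequence described above can be sketched as a small Python model. This is an illustration only, not the RTL: the function name step and the output dictionary are hypothetical, and the enable used in LineEnd (each block stays enabled until its own line_fin) is an assumption, since the corresponding figure annotation is only partly legible.

```python
from enum import Enum


class State(Enum):
    RESET = 0
    LINE_START = 1
    LINE_MID = 2
    LINE_END = 3


def step(state, data_st, mid_pt, line_fin, dot_order_mode, phi_go_pulse=0):
    """Advance the controller by one clock. Returns (next_state, outputs)."""
    # All outputs are zero unless a state says otherwise.
    out = {"dot_order_rdy": 0, "dot_cnt_rst": 0,
           "mode_sel": dot_order_mode, "gen_en": (0, 0)}
    if phi_go_pulse:
        return State.RESET, out
    if state is State.RESET:
        out["dot_order_rdy"] = 1          # tell the PHI controller we are ready
        if data_st:
            out["dot_cnt_rst"] = 1        # reset both dot counters on the way out
            return State.LINE_START, out
        return State.RESET, out
    if state is State.LINE_START:
        # Each block runs until it reaches its own mid point, then waits.
        out["gen_en"] = (int(not mid_pt[0]), int(not mid_pt[1]))
        return (State.LINE_MID if all(mid_pt) else state), out
    if state is State.LINE_MID:
        return State.LINE_END, out        # one cycle with gen_en = 0 to drain the pipeline
    # LINE_END: swap the streams and run to the end of the line.
    out["mode_sel"] = 1 - dot_order_mode
    out["gen_en"] = (int(not line_fin[0]), int(not line_fin[1]))  # assumed enable rule
    return (State.RESET if all(line_fin) else state), out
```

The model keeps the machine in its current state by default, matching the note on the state diagram.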
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The dot order controller selects which dot streams should feed which printhead channels. The order can be 
changed by configuring the DotOrderMode register. In all cases Channel A and Channel B must be in
opposing dot order modes. Table 165 shows the possible modes of operation.



Table 165. Mode selection in Dot order controller

Channel  (heading illegible)  (heading illegible)  Description
A        0                    0                    Even before Odd (EBO mode), even dot stream feeds Channel A printhead, first half line.
A        0                    1                    Odd before Even (OBE mode), odd dot stream feeds Channel A printhead, first half line.
A        1                    0                    Even before Odd (EBO mode), even dot stream feeds Channel A printhead, second half line.
A        1                    1                    Odd before Even (OBE mode), odd dot stream feeds Channel A printhead, second half line.
B        0                    0                    Odd before Even (OBE mode), odd dot stream feeds Channel B printhead, second half line.
B        0                    1                    Even before Odd (EBO mode), even dot stream feeds Channel B printhead, second half line.
B        1                    0                    Odd before Even (OBE mode), odd dot stream feeds Channel B printhead, first half line.
B        1                    1                    Even before Odd (EBO mode), even dot stream feeds Channel B printhead, first half line.


32.9.10.1 Dot order unit

The dot order unit accepts dot data from either dot stream from the LLU and writes the dot data into the
dot buffer. It has two modes of operation, odd before even (OBE) and even before odd (EBO). In the OBE
mode data from the odd dot stream is accepted first, then even; in EBO mode it is vice versa. The mode
is configurable by the DotOrderMode register.

The dot order unit maintains a dot count that is decremented each time a new dot is received from the
LLU. The dot order controller resets the dot counter to print_head_size[15:0] at the start of a new line
via the dot_cnt_rst signal. The dot count is compared with half the printhead size (print_head_size[15:0]
divided by 2) to determine the mid point (mid_pt). The line finish point (line_fin) is reached when the dot
counter is zero.

The mid point is defined as half the number of dots in a particular printhead, and is given by the
print_head_size bus.

// define the mid point
if (dot_cnt[15:0] [operator illegible] print_head_size[15:1]) then
    mid_pt = 1
else
    mid_pt = 0

The dot order unit logic maintains the dot data write pointer. Each time a new dot is written to the dot
buffer the write pointer is incremented. The fill level of the dot buffer is determined by comparing the read
and write pointers. The fill level is used to determine when to backpressure the LLU (ready signal) due to
the dot buffer filling. A suitable threshold value is determined to allow for the full LLU pipeline to empty
into the dot buffer.

The dot order stalling control is given by: 

// determine the ready/avail signal to use, based on mode select
if (mode_sel == 1) then
    dot_active = llu_phi_avail[0] AND ready
    wr_data = llu_phi_data[0]
else
    dot_active = llu_phi_avail[1] AND ready
    wr_data = llu_phi_data[1]
// update the counters
if (dot_active == 1) then {
    wr_en = 1
    wr_adr++
    if (dot_cnt == 0) then
        dot_cnt = print_head_size
    else
        dot_cnt--
}

The dot writer needs to determine when to stall the LLU dot data stream. A number of factors could stall
the dot stream in the LLU, such as the buffer filling, waiting for the mid point, waiting for the line finish, or
the dot order controller waiting for the line start condition from the PHI controller.

The stall logic is given by: 

// determine when to stall the LLU generator
fill_level = wr_adr - rd_adr
if (fill_level > (32 - THRESHOLD)) then // THRESHOLD is open value TBD
    ready = 0 // buffer is close to full
elsif (gen_en == 0) then
    ready = 0 // stalled by the datapath controller
else
    ready = 1 // everything good no stall
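
The counter and stall rules of the dot order unit can be combined into one small Python model. This is a sketch under stated assumptions: the class name DotOrderUnit is hypothetical, THRESHOLD is given an arbitrary value of 4 because the source leaves it TBD, the mid point is taken as "dot count at or below half the printhead size" because the comparison operator is not legible, and the pointers are plain integers rather than wrapping hardware addresses.

```python
class DotOrderUnit:
    BUFFER_DEPTH = 32  # taken from the stall pseudo-code: (32 - THRESHOLD)

    def __init__(self, print_head_size, threshold=4):
        self.print_head_size = print_head_size
        self.threshold = threshold          # assumed value; TBD in the source
        self.dot_cnt = print_head_size
        self.wr_adr = 0
        self.rd_adr = 0

    def reset_count(self):
        """Effect of dot_cnt_rst from the dot order controller."""
        self.dot_cnt = self.print_head_size

    @property
    def mid_pt(self):
        # Assumed comparison: at or past half of the configured printhead size.
        return int(self.dot_cnt <= (self.print_head_size >> 1))

    @property
    def line_fin(self):
        return int(self.dot_cnt == 0)

    def ready(self, gen_en):
        fill_level = self.wr_adr - self.rd_adr
        if fill_level > self.BUFFER_DEPTH - self.threshold:
            return 0                        # buffer is close to full
        if gen_en == 0:
            return 0                        # stalled by the datapath controller
        return 1

    def clock(self, gen_en, avail):
        """Accept one dot from the selected LLU stream if possible."""
        dot_active = int(bool(avail) and self.ready(gen_en) == 1)
        if dot_active:
            self.wr_adr += 1
            if self.dot_cnt == 0:
                self.dot_cnt = self.print_head_size
            else:
                self.dot_cnt -= 1
        return dot_active
```

With the assumed threshold of 4, the unit accepts dots until the fill level exceeds 28 and then deasserts ready until the data generator reads from the buffer.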
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32.9.10.2 Data generator 



[Figure 250: state diagram with the states Reset, SrclkPre, DataGen, MarginGen and SrclkPost. The
transition annotations are not legible.]

Machine remains in same state by default
All outputs are zero unless otherwise stated

State Description:

Reset:     Normal reset state
SrclkPre:  Count the SrclkPre number of clock cycles
DataGen:   Read line dot data from buffer
MarginGen: Generate DotMargin number of dots
SrclkPost: Wait for SrclkPost number of cycles

Figure 250. Data generator state diagram

The data generator block reads data from the dot buffer and feeds dot data to the printhead at a configured
rate (set by the PrintheadRate). It also generates the margin zero data and aligns the dot data generation to
the synchronization pulse from the PHI controller.

The data generator controller waits in the Reset state until it receives a line start pulse from the PHI
controller (data_st signal). Once a start pulse is received it proceeds to the SrclkPre state, loading a counter
with the SrclkPre value. While in this state it decrements the counter. No data is read or output at this stage.
When the count is zero the machine proceeds to the DataGen state.

On transition it loads the counter with the printhead size (print_head_size). If margining is to be used then
the configured print_head_size should be adjusted by the dot margin value, i.e. print_head_size =
(physical_print_head_size - (dot_margin * 2)).

While in the DataGen state data is read from the dot buffer and output to the printhead. The counter will
decrement for every dot data word transferred. The exact rate is dictated by the dot buffer fill levels and the
configured printhead rate (PrintheadRate).

The generator determines the rate by incrementing a rate counter (rate_cnt) while in the DataGen state.
The rate counter is allowed to wrap normally. If the bit selected by the rate_cnt in the print_head_rate bus
is one, data is transferred; otherwise the cycle is skipped. If the PrintheadRate is set to all zeros then no
data will ever get transferred. The pseudo-code for the DataGen state is given by:
// increment the rate count
rate_cnt++
// determine if data should be read
// first determine if data is available in buffer
if (rd_adr != wr_adr) then
    if (print_head_rate[rate_cnt] == 1) then
        dot_active = 1
        gate_srclk = 1
        rd_adr++
        dot_data = rd_data
        count--
    else
        dot_active = 0
        gate_srclk = 0
else
    dot_active = 0
    gate_srclk = 0

When the counter reaches zero the state machine will jump to the MarginGen state if the configured margin
value is non-zero, otherwise it will jump directly to the SrclkPost state. On transition to the MarginGen
state it loads the cycle counter with the dot_margin value, and begins to count down. While in the MarginGen
state the data generator logic block writes dot data to the printhead but does not read from the dot
buffers. It creates zero dot data words for the margin duration.

When the counter reaches zero the machine jumps to the SrclkPost state, loads the clock counter with the
SrclkPost value and decrements. When the count is finished the state machine returns to the Reset state and
awaits the next start pulse. Should a line sync arrive before the data generators have completed (data_fin
signal) the PHI controller will detect a print error and stall the PHI interface.
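
The rate mechanism of the DataGen state and the margin adjustment of the configured printhead size can be sketched in Python as follows. The function names, the 16-bit width of the print_head_rate bus, the initial rate_cnt of zero and the max_cycles guard are all assumptions made for illustration; the source does not state them.

```python
def configured_print_head_size(physical_print_head_size, dot_margin):
    """Printhead size to configure when a margin is generated on each side."""
    return physical_print_head_size - (dot_margin * 2)


def datagen_gates(print_head_rate, count, fill_level, rate_bits=16, max_cycles=1000):
    """Return gate_srclk for each DataGen cycle until `count` dot words are sent.

    print_head_rate is treated as a bit mask indexed by rate_cnt (width assumed).
    fill_level is the number of dot words waiting in the dot buffer.
    """
    rate_cnt = 0
    gates = []
    while count > 0 and len(gates) < max_cycles:
        rate_cnt = (rate_cnt + 1) % rate_bits      # counter wraps normally
        if fill_level > 0 and (print_head_rate >> rate_cnt) & 1:
            fill_level -= 1                        # rd_adr++
            count -= 1
            gates.append(1)                        # dot word transferred, srclk enabled
        else:
            gates.append(0)                        # cycle skipped
    return gates
```

A rate mask with every second bit set halves the transfer rate, and an all-zero mask never transfers data, which is why the sketch carries a cycle limit.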



32.9.10.3 Data serializer



The data serializer block converts 6-bit dot data at phiclk rates (nominally 106 MHz) to 2-bit data at doclk
rates (nominally 320 MHz).



[Figure 251: timing diagram showing phiclk, doclk, dot_data[5:0] (Invalid / Valid[5:0]), ph_data[1:0],
mux_sel, gate_srclk, gate_srclk_del and srclk.]

Figure 251. Data serializer timing

The srclk is only active when data is available for transfer to the printhead, as enabled by the gate_srclk
signal. The data rate mechanism in the data generator block will mean that data is not transferred to the
printhead on every phiclk cycle. Both the dot_data and gate_srclk signals are clocked out by the phiclk and
can only change on the rising edge of phiclk.

The data serializer block allows easy separation of clock gating and clock to logic structures from the rest
of the PHI interface. All registers in the block are clocked at doclk rates.



[Figure 252: RTL diagram of the data serializer. Legible labels: inputs phead_swap, dot_data[0][5:0],
dot_data[1][5:0], phiclk, phi_serial_order, gate_srclk[0], gate_srclk[1] and doclk; a mux logic block
selecting between dot_data[1:0], dot_data[3:2] and dot_data[5:4] under mux_sel; gate_srclk_del;
outputs ph_data[1:0] and srclk.]

Figure 252. Data serializer RTL Diagram

The mux logic determines which data bits from the dot_data bus should be selected for output on the
ph_data to the printhead. The selection is dependent on the phiclk edge.

if (phiclk == 1) then
    mux_sel = 1
elsif (mux_sel == 2) then
    mux_sel = 0
else
    [remainder of the pseudo-code is not legible]

The dot data serialization order can be configured by the PhiSerialOrder register. If the PhiSerialOrder is
zero the order is dot[1:0], then dot[3:2], then dot[5:4]. If the register is one then the order is dot[5:4],
dot[3:2], dot[1:0].
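
The effect of the PhiSerialOrder register can be shown with a short Python sketch. The function name serialize is hypothetical and the model ignores clocking; it only splits one 6-bit dot data word into the three 2-bit values that would be sent on successive doclk cycles.

```python
def serialize(dot_data, phi_serial_order):
    """Split a 6-bit dot word into three 2-bit ph_data values in output order."""
    pairs = [(dot_data >> shift) & 0x3 for shift in (0, 2, 4)]  # dot[1:0], dot[3:2], dot[5:4]
    return pairs if phi_serial_order == 0 else pairs[::-1]
```

For example, the word 0b100111 has dot[1:0] = 3, dot[3:2] = 1 and dot[5:4] = 2, so it is output as 3, 1, 2 when PhiSerialOrder is zero and as 2, 1, 3 when it is one. Three 2-bit transfers per 6-bit word matches the nominal ratio of roughly three doclk cycles (320 MHz) per phiclk cycle (106 MHz).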
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33 Test Units 



33.1 JTAG INTERFACE 



A standard JTAG (Joint Test Action Group) Interface is included in SoPEC for Bonding and IO testing
purposes. The JTAG port will provide access to all internal BIST (Built In Self Test) structures.



33.2 Scan Test I/O 



The SoPEC device will require several test IOs for running scan tests. In general scan in and scan out pins
will be multiplexed with functional pins.

33.3 Analog TEST Units 

33.3.1 USB PHY Testing 

The USB PHY analog macro will contain built-in test structures, which can be accessed by either the CPU
or through the JTAG port.

33.3.2 Embedded PLL Testing 

The embedded clock generator PLL will require test access from the JTAG port.
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34 SoPEC Pinning and Package 



34.1 Overview

It is intended that the SoPEC package be a 100 pin LQFP. Any spare pins in the package may be used by
increasing the number of available GPIO pins or adding extra power and ground pins. The pin list shows the
minimum pin requirement for the SoPEC device.



Table 166. SoPEC Pin List

Pin            Count  Dir  Type            Voltage  Internal signal        Description

Clocks and reset
xtalin         1      I    TBD             N/A      xtalin                 -
xtalout        1      O    TBD             N/A      xtalout                Crystal output pin
reset_n        1      I    LVTTL           2.5V     reset_n                Asynchronous active low reset

ph_data[0][0]  2      O    LVDS            3.3V     phi_ph_data_o[0][0]    Dot data for colors 0-2 for Printhead 0. Using differential signalling
                      I    LVTTL           3.3V     phi_ph_data_i[0]       Input mode bit used for nozzle test result, printhead 0
ph_data[0][1]  2      O    LVDS            3.3V     phi_ph_data_o[0][1]    Dot data for colors 3-5 for Printhead 0. Using differential signalling
                      I    LVTTL           3.3V     phi_ph_data_i[1]       Input mode bit used for temperature data, printhead 0
ph_data[1][0]  2      O    LVDS            3.3V     phi_ph_data_o[1][0]    Dot data for colors 0-2 for Printhead 1. Using differential signalling
                      I    LVTTL           3.3V     phi_ph_data_i[1]       Input mode bit used for nozzle test result, printhead 1
ph_data[1][1]  2      O    LVDS            3.3V     phi_ph_data_o[1][1]    Dot data for colors 3-5 for Printhead 1. Using differential signalling
                      I    LVTTL           3.3V     phi_ph_data_i[1]       Input mode bit used for temperature data, printhead 1
srclk[0]       2      O    LVDS            3.3V     phi_srclk[0]           Differential dot data shift clock for printhead 0
srclk[1]       2      O    LVDS            3.3V     phi_srclk[1]           Differential dot data shift clock for printhead 1
readl          1      O    LVTTL           3.3V     phi_readl              Common printhead mode control
frclk          1      O    LVTTL           3.3V     phi_frclk              Common fire pattern shift clock, needs to toggle once per fire cycle
profile        1      O    LVTTL           3.3V     phi_profile            Common pulse profile for all colors
lsyncl         1      O    LVTTL           3.3V     phi_lsyncl_o           Line sync output from Master to Slaves
                      I    LVTTL           3.3V     phi_lsyncl_i           Line sync input to Slaves from Master

USB Connections
usbd           2      I/O  Differential    3.3V     Direct PHY connection  USB differential data

JTAG
tdo            1      O    CMOS            2.5V     tdo                    JTAG Test data out port
tms            1      I    CMOS            2.5V     tms                    JTAG Test mode select
tdi            1      I    CMOS            2.5V     tdi                    JTAG Test data in port
tck            1      I    CMOS            2.5V     tck                    JTAG Test access port clock

General Purpose IO
gpio[3:0]      4      O    -               2.5V     gpio_o[3:0]            Motor control pins / general purpose output
                      I    CMOS            2.5V     gpio_i[3:0]            General purpose input
gpio[7:4]      4      O    Drive CMOS      2.5V     gpio_o[7:4]            LED driver pins / general purpose output
                      I    CMOS            2.5V     gpio_i[7:4]            General purpose input
gpio[11:8]     4      O    Open collector  2.5V     gpio_o[11:8]           LSS interface pins / general purpose output
                      I    CMOS            2.5V     gpio_i[11:8]           LSS interface pins / general purpose input
gpio[13:12]    2      O    CMOS            2.5V     gpio_o[13:12]          ISI interface pins / general purpose output
                      I    CMOS            2.5V     gpio_i[13:12]          ISI interface pins / general purpose input

Test Pins
test_enable    1      I    CMOS            2.5V     TBD                    Test Enable
generic_test   5      I/O  CMOS            2.5V     TBD                    Generic test pin, function undefined

Total Signal Pins  45

gnd            18     I    Power           N/A      gnd                    gnd
vdd            10     I    Power           N/A      vdd                    vdd 1.5V, core voltage
vdd250         3      I    Power           N/A      vdd250                 vdd 2.5V, IO voltage
vdd330         5      I    Power           N/A      vdd330                 vdd 3.3V, IO voltage

Total Pins         81
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35 Memjet Printhead 

This section is quoted verbatim from SoPEC/MoPEC Bilithic Printhead Reference document [10]. 

35.1 Background 

Silverbrook's bilithic Memjet™ printheads are the target printheads for printing systems which will be
controlled by SoPEC and MoPEC devices.

This document presents the format and structure of these printheads, and describes their possible
arrangements in the target systems. It also defines a set of terms used to differentiate between the types of
printheads and the systems which use them.

35.2 Companion Documents 

Currently, this document is only concerned with the structure of the printheads and their systems, with 
regard to the way in which dot data is loaded. 

Refer to the Bilithic Printhead Specification [2] for the complete description of the functionality of these 
devices. 

This document relies on certain definitions and details presented in Bilithic Printhead Specification [2]. 

35.3 Definitions 

This document presents terminology and definitions used to describe the bilithic printhead systems. These
terms and definitions are as follows:

• Printhead Type - There are 3 parameters which define the type of printhead used in a system:

   • Direction of the data flow through the printhead (clockwise or anti-clockwise, with the printhead
     shooting ink down onto the page).

   • Location of the left-most dot (upper row or lower row, with respect to V+).

   • Printhead footprint (type A or type B, characterized by the data pin being on the left or the right of
     V+, where V+ is at the top of the printhead).

• Printhead Arrangement - Even though there are 8 printhead types, each arrangement has to use a
  specific pairing of printheads, as discussed in Section 35.4. This gives 4 pairs of printheads. However,
  because the paper can flow in either direction with respect to the printheads, there are a total of eight
  possible arrangements, e.g. Arrangement 1 has a Type 0 printhead on the left with respect to the
  paper flow, and a Type 1 printhead on the right. Arrangement 2 uses the same printhead pair as
  Arrangement 1, but the paper flows in the opposite direction.

• Color 0 is always the first color plane encountered by the paper.

• Dot 0 is defined as the nozzle which can print a dot in the left-most side of the page.

• The Even Plane of a color corresponds to the row of nozzles that prints dot 0.

Note that throughout this document, where the various printheads and systems are presented, the
printheads always shoot ink down onto the page.

Figure 253 shows the 8 different possible printhead types. Type 0 is identical to the Right Printhead
presented in Figure 3 in [2], and Type 1 is the same as the Left Printhead as defined in [2].

While the printheads shown in Figure 253 look to be of equal width (having the same number of nozzles) it
is important to remember that in a typical system, a pair of unequal sized printheads may be used.



[Figure 253: eight panels, one for each of the Type 0 to Type 7 printheads, each showing the nozzle rows
of one color plane (Color n).]

Figure 253. Printhead Types 0 to 7

Table 167 defines the printhead pairing and location of each printhead type, with respect to the flow of
paper, for the 8 possible arrangements.

Table 167. Definition of the different printhead arrangements

Arrangement    Printhead on left side  Printhead on right side
Arrangement 1  Type 0                  Type 1
Arrangement 2  Type 1                  Type 0
Arrangement 3  Type 2                  Type 3
Arrangement 4  Type 3                  Type 2
Arrangement 5  Type 4                  Type 5
Arrangement 6  Type 5                  Type 4
Arrangement 7  Type 6                  Type 7
Arrangement 8  Type 7                  Type 6
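
The pairing in Table 167 can be captured as a simple lookup, sketched here in Python. The names ARRANGEMENTS and reversed_flow are hypothetical, and three of the type numbers in the table (Arrangements 4, 6 and 8) are inferred from the pairing pattern because they are not fully legible in the source.

```python
# Arrangement number -> (type of left printhead, type of right printhead),
# left and right taken with respect to the direction of paper flow.
ARRANGEMENTS = {
    1: (0, 1), 2: (1, 0),
    3: (2, 3), 4: (3, 2),
    5: (4, 5), 6: (5, 4),
    7: (6, 7), 8: (7, 6),
}


def reversed_flow(arrangement):
    """Return the arrangement that uses the same pair with the paper flow reversed."""
    left, right = ARRANGEMENTS[arrangement]
    for number, pair in ARRANGEMENTS.items():
        if pair == (right, left):
            return number
    raise ValueError("no matching arrangement")
```

Reversing the paper flow swaps which printhead of a pair is on the left, so the eight arrangements fall into four pairs: (1, 2), (3, 4), (5, 6) and (7, 8).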
35.4 Bilithic Printhead Systems



When using the bilithic printheads, the position of the power/gnd bars coupled with the physical footprint
of the printheads mean that we must use a specific pairing of printheads together for printing on the same
side of an A4 (or wider) page, e.g. we must always use a Type 0 printhead with a Type 1 printhead.

While a given printing system can use any one of the eight possible arrangements of printheads, this
document only presents two of them, Arrangement 1 and Arrangement 2, for purposes of illustration. These
two arrangements are discussed in subsequent sections of this document. However, the other 6 possibilities
also need to be considered.

The main difference between the two printhead arrangements discussed in this document is the direction
of the paper flow. Because of this, the dot data has to be loaded differently in Arrangement 1 compared to
Arrangement 2, in order to render the page correctly.



35.4.1 Example 1: Printhead Arrangement 1

Figure 254 shows an Arrangement 1 printing setup, where the bilithic printheads are arranged as follows:

• The Type 0 printhead is on the left with respect to the direction of the paper flow.

• The Type 1 printhead is on the right.

[Figure 254: the Type 0 and Type 1 printheads with the Gnd bar and the direction of paper flow marked.
The printheads are facing downwards. The ink is being shot down onto the page.]

Figure 254. Identification of printhead nozzles and shift-register sequences for printheads in
Arrangement 1

Table 168 lists the order in which the dot data needs to be loaded into the above printhead system, to
ensure color 0-dot 0 appears on the left side of the printed page.



Table 168. Order in which the even and odd dots are loaded for printhead Arrangement 1

Dot sense  (heading illegible)               (heading illegible)
Odd        Loaded second in descending order  Loaded first in descending order
Even       Loaded first in ascending order    Loaded second in ascending order




Figure 255 shows how the dot data is demultiplexed within the printheads. 

[Figure 255: the Type 0 and Type 1 printheads, each with its Data[0] and Data[1] inputs.]

Figure 255. Demultiplexing of data within the printheads in Arrangement 1

Figure 256 and Figure 257 show the way in which the dot data needs to be loaded into the printheads in
Arrangement 1, to ensure that color 0-dot 0 appears on the left side of the printed page.

[Figure 256: Data[0] and Data[1] dot sequences; the individual dot labels are not legible.]

Figure 256. Signalling for a Type 0 printhead in Arrangement 1

[Figure 257: Data[0] dot sequence; the individual dot labels are not legible.]

Figure 257. Signalling for a Type 1 printhead in Arrangement 1



35.4.2 Example 2: Printhead Arrangement 2 

Figure 258 shows an Arrangement 2 printing setup, where the bilithic printheads are arranged as follows:

• The Type 1 printhead is on the left with respect to the direction of the paper flow.

• The Type 0 printhead is on the right.

[Figure 258: the Type 0 and Type 1 printheads with the Gnd bar and the direction of paper flow marked.
The printheads are facing downwards. The ink is being shot down onto the page.]

Figure 258. Identification of printhead nozzles and shift-register sequences for printheads in
Arrangement 2

Table 169 lists the order in which the dot data needs to be loaded into the above printhead system, to
ensure color 0-dot 0 appears on the left side of the printed page.

Table 169. Order in which the even and odd dots are loaded for printhead Arrangement 2

Dot sense  (heading illegible)               (heading illegible)
Odd        Loaded first in descending order   Loaded second in descending order
Even       Loaded second in ascending order   Loaded first in ascending order

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Figure 259 shows how the dot data is demultiplexed within the printheads. 
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Figure 259. Demultiplexing of data within the printheads in Arrangement 2 

Figure 260 and Figure 261 show the way in which the dot data needs to be loaded into the printheads in
Arrangement 2, to ensure that color 0-dot 0 appears on the left side of the printed page.

[Figure 260: Data[0] and Data[1] dot sequences; the individual dot labels are not legible.]

Figure 260. Signalling for a Type 0 printhead in Arrangement 2

[Figure 261: Data[0], Data[1] and SrClk waveforms; the individual dot labels are not legible.]

Figure 261. Signalling for a Type 1 printhead in Arrangement 2

35.4.3 Conclusions 

Comparing the signalling diagrams for Arrangement 1 with those shown for Arrangement 2, it can be seen
that the color/dot sequence output for a printhead type in Arrangement 1 is the reverse of the sequence for
the same printhead in Arrangement 2 in terms of the order in which the color plane data is output, as well as
whether even or odd data is output first. However, the order within a color plane remains the same, i.e. odd
descending, even ascending.

From Figure 262 and Table 170, it can be seen that the plane which has to be loaded first (i.e. even or odd)
depends on the arrangement. Also, the order in which the dots have to be loaded (e.g. even ascending or
descending etc.) is dependent on the arrangement.

If the device controlling the printheads can re-order the bits according to the following criteria, then it
should be able to operate in all the possible printhead arrangements:

• Be able to output the even or odd plane first.

• Be able to output even and odd planes in either ascending or descending order, independently.

• Be able to reverse the sequence in which the color planes of a single dot are output to the printhead.
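
The first two criteria can be illustrated with a minimal Python sketch that produces the order in which the dots of one color plane would be loaded. The function name load_order and its string parameters are hypothetical, and the model deliberately ignores the color plane sequence and the split of data across Data[0] and Data[1].

```python
def load_order(n_dots, first_plane, even_dir, odd_dir):
    """Return dot indices in load order for one color plane.

    first_plane: "even" or "odd", the plane that is output first.
    even_dir, odd_dir: "ascending" or "descending", set independently.
    """
    even = list(range(0, n_dots, 2))   # even plane contains dot 0
    odd = list(range(1, n_dots, 2))
    if even_dir == "descending":
        even.reverse()
    if odd_dir == "descending":
        odd.reverse()
    return even + odd if first_plane == "even" else odd + even
```

For an 8-dot line, "even ascending loaded first, odd descending loaded second" gives 0, 2, 4, 6, 7, 5, 3, 1, while "odd descending loaded first, even ascending loaded second" gives 7, 5, 3, 1, 0, 2, 4, 6. Keeping the three settings independent is what lets one controller cover every combination listed for the eight arrangements.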
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Figure 262. All 8 Printhead Arrangements 



Table 170. Order in which even and odd dots and planes are loaded into the various printhead
arrangements

Arrangement    (heading illegible)                                         (heading illegible)
Arrangement 1  Even ascending loaded first, odd descending loaded second   Odd descending loaded first, even ascending loaded second
Arrangement 2  Odd descending loaded first, even ascending loaded second   Even ascending loaded first, odd descending loaded second
Arrangement 3  Odd ascending loaded first, even descending loaded second   Even descending loaded first, odd ascending loaded second
Arrangement 4  Even descending loaded first, odd ascending loaded second   Odd ascending loaded first, even descending loaded second
Arrangement 5  Odd ascending loaded first, even descending loaded second   Even descending loaded first, odd ascending loaded second
Arrangement 6  Even descending loaded first, odd ascending loaded second   Odd ascending loaded first, even descending loaded second
Arrangement 7  Even ascending loaded first, odd descending loaded second   Odd descending loaded first, even ascending loaded second
Arrangement 8  Odd descending loaded first, even ascending loaded second   Even ascending loaded first, odd descending loaded second
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